UniProt Paper - Gene - Data Type
Revision as of 19:21, 19 May 2015
Original file generated for UniProt:
http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/uniprot.cgi
Updates to file will now include data types curated for a given gene.
What we currently supply:
WBGene WBPaperID PMID
Not sure how this is generated - before my time.
What we need to add:
WBGene WBPaperID PMID Category
The Categories would be gene-specific and we will supply information for:
GO;PPI;Phenotype;Disease;Expression;Sequence
An example:
WBGene00003508 WBPaper
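A minimal sketch of how one line of the extended file could be assembled. It assumes tab-separated columns and a semicolon-separated Category field, neither of which is confirmed for the existing file. The function name, the paper ID and the PMID are placeholders for illustration only, not real identifiers.

```python
# Category names as listed above; the fixed output order is an assumption.
CATEGORY_ORDER = ["GO", "PPI", "Phenotype", "Disease", "Expression", "Sequence"]


def format_line(gene, paper, pmid, categories):
    """Return one output line: WBGene, WBPaperID, PMID, Category."""
    unknown = set(categories) - set(CATEGORY_ORDER)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    # Emit categories in a stable order so successive releases diff cleanly.
    ordered = [c for c in CATEGORY_ORDER if c in categories]
    return "\t".join([gene, paper, pmid, ";".join(ordered)])


# Placeholder paper ID and PMID (hypothetical values).
example = format_line("WBGene00003508", "WBPaper00000000", "0000000",
                      {"Phenotype", "GO"})
```

Keeping the category order fixed is a design choice rather than a requirement; it only makes the file easier to compare between WS releases.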
Strategy: Several possible strategies - not sure which is best.
Is it easiest to get everything from WS, or from a mixture of WS and postgres?
Some things, like GO, RNAi and Variation Phenotypes, need to come from WS.
1) Start with Paper object and then trace the data types in the Refers_to tag - this works for everything but Disease
2) Look at each object in each relevant class - this seems computationally very intensive
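Strategy 1 could be sketched roughly as below, assuming the Paper objects have already been dumped into a dictionary that mirrors the Refers_to tag. The dictionary layout, the `resolve` callback and all IDs in the toy data are hypothetical; `resolve` stands in for whatever lookup turns a referenced object into (gene, category) pairs, including filters such as keeping only Physical interactions. Disease is not reachable this way and would need a separate pass over the Gene objects.

```python
from collections import defaultdict


def trace_papers(papers, resolve):
    """Collect categories per (gene, paper) pair, starting from papers.

    papers:  {paper_id: {class_name: [object_id, ...]}}, mirroring Refers_to.
    resolve: callable (class_name, object_id) -> iterable of (gene_id, category).
    """
    found = defaultdict(set)
    for paper_id, refers_to in papers.items():
        for class_name, object_ids in refers_to.items():
            for object_id in object_ids:
                for gene_id, category in resolve(class_name, object_id):
                    found[(gene_id, paper_id)].add(category)
    return dict(found)


# Toy data with made-up IDs, only to show the shape of the inputs.
toy_papers = {"WBPaper_A": {"RNAi": ["rnai_1"], "Expr_pattern": ["expr_1"]}}


def toy_resolve(class_name, object_id):
    table = {
        ("RNAi", "rnai_1"): [("WBGene_X", "Phenotype")],
        ("Expr_pattern", "expr_1"): [("WBGene_X", "Expression")],
    }
    return table.get((class_name, object_id), [])


toy_result = trace_papers(toy_papers, toy_resolve)
```

Each referenced object is visited once per paper that cites it, which is the main attraction over strategy 2, where every object in every relevant class would be scanned.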
How to map this onto our data types from each WS release:
GO:
?GO_annotation -> Gene -> Reference
PPI:
?Interaction -> Interaction_type Physical -> Interactor_overlapping_gene -> Paper
Phenotype:
?RNAi -> Inhibits -> Phenotype -> Reference
?Variation -> Affects -> Phenotype -> Reference
Expression:
?Expr_pattern -> Expression_of -> Gene -> Reference
Sequence:
?Variation -> Affects
-> any one of these filled in: Nonsense, Missense, Silent, Splice_site, Frameshift, Readthrough
-> Reference
Disease:
?Gene -> Disease_info -> Experimental -> Evidence -> Paper_evidence
?Gene -> Disease_info -> Disease_relevance -> Evidence -> Paper_evidence
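The Sequence rule above (a Variation counts when any one of the listed tags is filled in under Affects) can be written as a small check. This is a sketch only: it assumes the tags present for one Variation and gene have already been parsed into a set of tag names, and the function name is hypothetical.

```python
# Tags taken from the Sequence mapping above.
SEQUENCE_TAGS = frozenset(
    ["Nonsense", "Missense", "Silent", "Splice_site", "Frameshift", "Readthrough"]
)


def has_sequence_data(present_tags):
    """Return True if at least one Sequence tag is among the tags present."""
    return not SEQUENCE_TAGS.isdisjoint(present_tags)
```

A Variation that passes this check would contribute Sequence to the Category field for each paper in its Reference tag, independently of whether it also contributes Phenotype.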