Difference between revisions of "UniProt Paper - Gene - Data Type"
Revision as of 19:11, 19 May 2015
Original file generated for UniProt:
http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/uniprot.cgi
Updates to file will now include data types curated for a given gene.
What we currently supply:
WBGene WBPaperID PMID
Not sure how this is generated - before my time.
What we need to add:
WBGene WBPaperID PMID Category
Is it easiest to get everything from WS, or from a mixture of WS and postgres?
Some things, like GO, RNAi and Variation Phenotypes, need to come from WS.
The Categories would be gene-specific and we will supply information for:
GO;PPI;Phenotype;Disease;Expression;Sequence
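A minimal sketch of how the extra Category column could be assembled once (gene, paper, category) triples are in hand. Everything here is an assumption for illustration: the function name `build_rows`, tab-separated columns, semicolon-joined categories, and the placeholder IDs are not taken from the existing uniprot.cgi script.

```python
from collections import defaultdict

# Category names in the order listed above.
CATEGORY_ORDER = ["GO", "PPI", "Phenotype", "Disease", "Expression", "Sequence"]


def build_rows(annotations, pmid_by_paper):
    """Collapse (gene, paper, category) triples into one row per gene-paper pair.

    annotations   -- iterable of (WBGene, WBPaperID, category) tuples
    pmid_by_paper -- dict mapping WBPaperID to PMID (hypothetical lookup)
    """
    categories = defaultdict(set)
    for gene, paper, category in annotations:
        categories[(gene, paper)].add(category)

    rows = []
    for gene, paper in sorted(categories):
        # Keep a fixed category order so the output is stable between runs.
        cats = [c for c in CATEGORY_ORDER if c in categories[(gene, paper)]]
        pmid = pmid_by_paper.get(paper, "")  # left empty when no PMID is known
        rows.append("\t".join([gene, paper, pmid, ";".join(cats)]))
    return rows


if __name__ == "__main__":
    # Placeholder IDs, not real curated data.
    demo = [
        ("WBGene00000001", "WBPaper00000002", "Phenotype"),
        ("WBGene00000001", "WBPaper00000002", "GO"),
    ]
    for row in build_rows(demo, {"WBPaper00000002": "12345678"}):
        print(row)
```

Keeping the aggregation separate from the data source means the same step works whether the triples come from WS alone or from a mixture of WS and postgres.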
Strategy: Several possible strategies - not sure which is best.
1) Start with Paper object and then trace the data types in the Refers_to tag - this works for everything but Disease
2) Look at each object in each relevant class - this seems computationally very intensive
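Strategy 1 could be sketched roughly as follows. This assumes each Paper has been dumped to a dict whose "Refers_to" entry maps a class name to a list of object IDs; that layout, the class names used as keys, and the helper name `objects_to_trace` are assumptions, not the actual ?Paper model. Disease is absent because, as noted above, it cannot be reached this way and would need a separate pass over ?Gene.

```python
# Classes under Refers_to that feed one of the categories (assumed key names).
RELEVANT_CLASSES = ("GO_annotation", "Interaction", "RNAi", "Variation", "Expr_pattern")


def objects_to_trace(paper):
    """List the (class, object ID) pairs a paper points to that are worth following."""
    refers_to = paper.get("Refers_to", {})
    return [
        (cls, obj_id)
        for cls in RELEVANT_CLASSES
        for obj_id in refers_to.get(cls, [])
    ]


if __name__ == "__main__":
    # Placeholder paper record, not real data.
    paper = {
        "id": "WBPaper00000001",
        "Refers_to": {"RNAi": ["WBRNAi00000001"], "Variation": ["WBVar00000001"]},
    }
    print(objects_to_trace(paper))
```

Only objects actually cited by a paper are visited, which is why this route is likely cheaper than strategy 2.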
How to map this onto our data types from each WS release:
GO:
?GO_annotation -> Gene -> Reference
PPI:
?Interaction -> Interaction_type Physical -> Interactor_overlapping_gene -> Paper
Phenotype:
?RNAi -> Inhibits -> Phenotype -> Reference
?Variation -> Affects -> Phenotype -> Reference
Expression:
?Expr_pattern -> Expression_of -> Gene -> Reference
Sequence:
?Variation -> Affects
           -> Nonsense
           -> Missense
           -> Silent
           -> Splice_site
           -> Frameshift
           -> Readthrough
           (any one of these filled in)
           -> Reference
Disease:
?Gene -> Disease_info -> Experimental -> Evidence -> Paper_evidence
      -> Disease_info -> Disease_relevance -> Evidence -> Paper_evidence
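The mapping above can be written down as a small set of rules. This is only a sketch: it assumes each object has already been flattened into a dict keyed by the tag names listed above, with genes and papers as lists of IDs. That representation, and the function names `categories_for`, `triples` and `disease_triples`, are hypothetical.

```python
# Variation tags that mark a sequence-level change; any one filled in counts.
SEQUENCE_TAGS = ("Nonsense", "Missense", "Silent", "Splice_site", "Frameshift", "Readthrough")


def categories_for(class_name, obj):
    """Return (genes, papers, categories) for one flattened object."""
    if class_name == "GO_annotation":
        return obj.get("Gene", []), obj.get("Reference", []), ["GO"]
    if class_name == "Interaction":
        # Only physical interactions count as PPI.
        cats = ["PPI"] if obj.get("Interaction_type") == "Physical" else []
        return obj.get("Interactor_overlapping_gene", []), obj.get("Paper", []), cats
    if class_name == "RNAi":
        cats = ["Phenotype"] if obj.get("Phenotype") else []
        return obj.get("Inhibits", []), obj.get("Reference", []), cats
    if class_name == "Variation":
        cats = []
        if obj.get("Phenotype"):
            cats.append("Phenotype")
        if any(obj.get(tag) for tag in SEQUENCE_TAGS):
            cats.append("Sequence")
        return obj.get("Affects", []), obj.get("Reference", []), cats
    if class_name == "Expr_pattern":
        return obj.get("Expression_of", []), obj.get("Reference", []), ["Expression"]
    return [], [], []


def triples(class_name, obj):
    """Expand one object into (gene, paper, category) triples."""
    genes, papers, cats = categories_for(class_name, obj)
    return {(g, p, c) for g in genes for p in papers for c in cats}


def disease_triples(gene_id, disease_info):
    """Disease starts from ?Gene: papers under Experimental or Disease_relevance."""
    papers = set(disease_info.get("Experimental", []))
    papers |= set(disease_info.get("Disease_relevance", []))
    return {(gene_id, p, "Disease") for p in papers}
```

A Variation can yield both Phenotype and Sequence for the same gene and paper, so the triples are collected in sets and merged per gene-paper pair afterwards.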