UniProt Paper - Gene - Data Type
Original file generated for UniProt:
What we currently supply:
WBGene WBPaperID PMID
It is not clear how this file is generated; I believe it was set up before my time as paper curator.
The proposed update to the file adds the data types curated for a given gene.
The updated file would need these columns:
WBGene WBPaperID PMID Category
The Categories would be gene-specific and we will supply information for:
PPI (Protein-Protein Interaction)
WBGene00003508 WBPaper00003680 pmid10517638 GO;Phenotype;Disease;Expression
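The proposed row format can be sketched as below. This is only an illustration, assuming tab-separated columns and semicolon-joined categories as in the example row; the function name build_rows and the fixed category order are hypothetical and not part of any existing script.

```python
from collections import defaultdict

# Assumed output order: the first four follow the example row, PPI is appended.
CATEGORY_ORDER = ["GO", "Phenotype", "Disease", "Expression", "PPI"]


def build_rows(annotations):
    """Collapse (gene, paper, pmid, category) tuples into one row per gene-paper pair."""
    grouped = defaultdict(set)
    for gene, paper, pmid, category in annotations:
        grouped[(gene, paper, pmid)].add(category)

    rows = []
    for key in sorted(grouped):
        cats = grouped[key]
        # Known categories first, in the assumed order; anything else alphabetically.
        ordered = [c for c in CATEGORY_ORDER if c in cats]
        ordered += sorted(cats - set(CATEGORY_ORDER))
        rows.append("\t".join([*key, ";".join(ordered)]))
    return rows
```

Grouping on the gene-paper pair keeps the categories gene-specific: a paper that mentions two genes gets two rows, each listing only what was curated for that gene.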
Strategy: There are several possible strategies; it is not yet clear which is best.
Is it easiest to get everything from WS, or from a mixture of WS and postgres?
Some things, like GO, RNAi and Variation Phenotypes, need to come from WS.
1) Start with the Paper object and trace the information in the objects xref'ed in its Refers_to tag. This works for everything but Disease.
2) Look at each object in each relevant class. This seems computationally very intensive.
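A rough sketch of strategy 1, using plain dictionaries as stand-ins for WS objects. Everything here is an assumption for illustration: the dictionary layout, the function name categories_from_paper, and the class-to-category mapping, which is inferred from the category names in the example row. It does not apply the filters on Interaction_type or observed phenotypes.

```python
# Assumed mapping from WS class name to output category (inferred, not confirmed).
CLASS_TO_CATEGORY = {
    "GO_annotation": "GO",
    "Interaction": "PPI",
    "RNAi": "Phenotype",
    "Variation": "Phenotype",
    "Expr_pattern": "Expression",
}


def categories_from_paper(paper, objects):
    """Walk a paper's Refers_to entries and collect (gene, paper, category) triples.

    paper:   {"id": str, "Refers_to": {class_name: [object_id, ...]}}
    objects: {object_id: {"genes": [gene_id, ...]}}
    """
    found = set()
    for class_name, object_ids in paper.get("Refers_to", {}).items():
        category = CLASS_TO_CATEGORY.get(class_name)
        if category is None:
            continue  # class is not one of the curated data types
        for object_id in object_ids:
            for gene in objects.get(object_id, {}).get("genes", []):
                found.add((gene, paper["id"], category))
    return found
```

Only objects actually cited by a paper are visited, which is why this should be cheaper than scanning every object in every class. Disease would still need a separate pass starting from the genes.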
Relevant tags in the different object models:
?GO_annotation -> Gene -> Reference
?Interaction -> Interaction_type Physical -> Interactor_overlapping_gene -> Paper
?RNAi -> Inhibits -> Phenotype (Only Phenotype Observed, doesn't matter what the Phenotype is) -> Reference
?Variation -> Affects -> Phenotype (Only Phenotype Observed, doesn't matter what the Phenotype is) -> Reference
?Expr_pattern -> Expression_of -> Gene -> Reference
?Variation -> Affects -> any one of these filled in: Nonsense, Missense, Silent, Splice_site, Frameshift, Readthrough
?Gene -> Disease_info -> Experimental -> Evidence -> Paper_evidence
?Gene -> Disease_info -> Disease_relevance -> Evidence -> Paper_evidence
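Because Disease cannot be reached from the Paper object, it would have to be collected from the gene side. A minimal sketch, assuming each gene is available as a nested dictionary that mirrors the two Disease_info tag paths, with the Evidence level collapsed into a Paper_evidence list. The layout and the function name disease_papers are hypothetical.

```python
def disease_papers(gene):
    """Return (gene, paper, "Disease") triples from both Disease_info branches.

    gene: {"id": str,
           "Disease_info": {"Experimental":      [{"Paper_evidence": [paper_id, ...]}],
                            "Disease_relevance": [{"Paper_evidence": [paper_id, ...]}]}}
    """
    papers = set()
    info = gene.get("Disease_info", {})
    for branch in ("Experimental", "Disease_relevance"):
        for entry in info.get(branch, []):
            papers.update(entry.get("Paper_evidence", []))
    return {(gene["id"], paper, "Disease") for paper in papers}
```

A paper cited under both branches yields a single triple, so the output would not list Disease twice for the same gene-paper pair.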