UniProt Paper - Gene - Data Type
Revision as of 18:21, 20 May 2015
Original file generated for UniProt:
http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/uniprot.cgi
What we currently supply:
WBGene WBPaperID PMID
It is not clear how this file is generated; I believe it was set up before my time as paper curator.
The proposed update to the file will add the data types curated for a given gene.
We would need to add a Category column:
WBGene WBPaperID PMID Category
The categories would be gene-specific, and we will supply information for:
GO
PPI (Protein-Protein Interaction)
Phenotype
Disease
Expression
Sequence
An example:
WBGene00003508 WBPaper00003680 pmid10517638 GO;Phenotype;Disease;Expression
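A minimal sketch of how one line of the proposed file could be assembled. The function name `format_row` is hypothetical, and the tab delimiter between columns is an assumption (the example only shows whitespace); the semicolon between categories and the category order follow the list and the example above.

```python
# Category order follows the list above; the tab delimiter is an assumption.
CATEGORY_ORDER = ["GO", "PPI", "Phenotype", "Disease", "Expression", "Sequence"]


def format_row(gene, paper, pmid, categories):
    """Return one output line: gene, paper, PMID, then ';'-joined categories."""
    unknown = set(categories) - set(CATEGORY_ORDER)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    # Emit categories in a fixed order so the file is stable between runs.
    ordered = [c for c in CATEGORY_ORDER if c in categories]
    return "\t".join([gene, paper, pmid, ";".join(ordered)])


row = format_row(
    "WBGene00003508",
    "WBPaper00003680",
    "pmid10517638",
    {"Expression", "GO", "Disease", "Phenotype"},
)
print(row)
```

Keeping the category order fixed means the same gene-paper pair always produces the same line, which makes successive versions of the file easy to compare.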
Strategy: There are several possible strategies; it is not yet clear which is best.
Is it easiest to get everything from WS, or from a mixture of WS and postgres?
Some things, like GO, RNAi and Variation Phenotypes, need to come from WS.
Possibilities:
1) Start with the Paper object and then trace the information in the objects xref'ed in the Refers_to tag. This works for everything but Disease.
2) Look at each object in each relevant class. This seems computationally very intensive.
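Whichever possibility is chosen, the intermediate result can probably be treated as a stream of (gene, paper, category) triples that is then collapsed into one entry per gene-paper pair. The sketch below shows only that collapsing step; the function name `collapse` is hypothetical, and the input triples are simply the example line above broken apart, not the output of a real query.

```python
from collections import defaultdict

CATEGORY_ORDER = ["GO", "PPI", "Phenotype", "Disease", "Expression", "Sequence"]


def collapse(triples):
    """Group (gene, paper, category) triples into {(gene, paper): [categories]}."""
    grouped = defaultdict(set)
    for gene, paper, category in triples:
        grouped[(gene, paper)].add(category)
    # Sort the pairs and order the categories so the output is deterministic.
    return {
        pair: [c for c in CATEGORY_ORDER if c in grouped[pair]]
        for pair in sorted(grouped)
    }


# Illustrative input taken from the example line above.
triples = [
    ("WBGene00003508", "WBPaper00003680", "Expression"),
    ("WBGene00003508", "WBPaper00003680", "GO"),
    ("WBGene00003508", "WBPaper00003680", "Phenotype"),
    ("WBGene00003508", "WBPaper00003680", "Disease"),
    ("WBGene00003508", "WBPaper00003680", "GO"),  # duplicates are harmless
]
rows = collapse(triples)
```

Because the triples can come from any source, this step would work equally well with WS alone or with a mixture of WS and postgres.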
Relevant tags in the different object models:
GO:
?GO_annotation -> Gene -> Reference
PPI:
?Interaction -> Interaction_type Physical -> Interactor_overlapping_gene -> Paper
Phenotype:
?RNAi -> Inhibits -> Phenotype -> Reference
?Variation -> Affects -> Phenotype -> Reference
Expression:
?Expr_pattern -> Expression_of -> Gene -> Reference
Sequence:
?Variation -> Affects
-> Nonsense, Missense, Silent, Splice_site, Frameshift or Readthrough (any one of these filled in)
-> Reference
Disease:
?Gene -> Disease_info -> Experimental -> Evidence -> Paper_evidence
?Gene -> Disease_info -> Disease_relevance -> Evidence -> Paper_evidence
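The tag paths above can be written down as a small lookup table, which makes it easy to see which classes would have to be scanned under possibility 2. This is a sketch only: the names `TAG_PATHS` and `categories_by_class` are hypothetical, the Sequence entry assumes that any one of the six listed tags qualifies, and the Disease entry assumes the listing describes two separate paths under Disease_info.

```python
# category -> list of (class, tag path) pairs, transcribed from the listing above.
TAG_PATHS = {
    "GO": [("GO_annotation", ["Gene", "Reference"])],
    "PPI": [("Interaction",
             ["Interaction_type Physical", "Interactor_overlapping_gene", "Paper"])],
    "Phenotype": [
        ("RNAi", ["Inhibits", "Phenotype", "Reference"]),
        ("Variation", ["Affects", "Phenotype", "Reference"]),
    ],
    "Expression": [("Expr_pattern", ["Expression_of", "Gene", "Reference"])],
    # Assumption: any one of the six tags being filled in is enough.
    "Sequence": [("Variation",
                  ["Affects",
                   "Nonsense|Missense|Silent|Splice_site|Frameshift|Readthrough",
                   "Reference"])],
    # Assumption: two separate paths under Disease_info.
    "Disease": [
        ("Gene", ["Disease_info", "Experimental", "Evidence", "Paper_evidence"]),
        ("Gene", ["Disease_info", "Disease_relevance", "Evidence", "Paper_evidence"]),
    ],
}


def categories_by_class(tag_paths):
    """Invert the table: class name -> sorted list of categories it can supply."""
    by_class = {}
    for category, paths in tag_paths.items():
        for class_name, _tags in paths:
            by_class.setdefault(class_name, set()).add(category)
    return {name: sorted(cats) for name, cats in by_class.items()}


by_class = categories_by_class(TAG_PATHS)
for class_name in sorted(by_class):
    print(class_name, by_class[class_name])
```

Inverting the table shows that ?Variation feeds both Phenotype and Sequence, so a single pass over that class could populate both categories.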