Difference between revisions of "UniProt Paper - Gene - Data Type"

From WormBaseWiki

Revision as of 15:31, 21 May 2015

Original file generated for UniProt:

http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/uniprot.cgi

What we currently supply:

WBGene WBPaperID PMID

Not sure how this is generated; I believe this was done before my time as paper curator.


The proposed update to the file will add the data types curated for each gene.

We would need to add a Category column:

WBGene WBPaperID PMID Category


The Categories would be gene-specific and we will supply information for:

GO

PPI (Protein-Protein Interaction)

Phenotype

Disease

Expression

Sequence


An example:

    WBGene00003508  WBPaper00003680  pmid10517638  GO;Phenotype;Disease;Expression 
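
A minimal Python sketch of how such lines could be assembled once the per-category data have been collected. The input shapes (a list of gene/paper/category tuples and a paper-to-PMID dict), the function name, the tab separator, and the choice to order categories as in the list above are all assumptions made for illustration, not the existing uniprot.cgi code.

```python
from collections import defaultdict

# Category order follows the list above; using it for output order is an assumption.
CATEGORY_ORDER = ["GO", "PPI", "Phenotype", "Disease", "Expression", "Sequence"]


def build_rows(records, pmid_for_paper):
    """Collapse (gene, paper, category) records into one line per gene-paper pair.

    records: iterable of (WBGene, WBPaper, category) tuples (hypothetical input).
    pmid_for_paper: dict mapping WBPaper IDs to PMID strings (hypothetical input).
    """
    categories = defaultdict(set)
    for gene, paper, category in records:
        categories[(gene, paper)].add(category)

    rows = []
    for (gene, paper), found in sorted(categories.items()):
        # Keep only known categories, in the fixed order.
        ordered = [c for c in CATEGORY_ORDER if c in found]
        pmid = pmid_for_paper.get(paper, "")  # empty field if no PMID is known
        rows.append("\t".join([gene, paper, pmid, ";".join(ordered)]))
    return rows
```

Duplicate records for the same gene, paper and category collapse into a single entry, so the same paper can be reached through several objects without repeating a category.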



Strategy: There are several possible strategies; not sure which is best.

Is it easiest to get everything from WS, or from a mixture of WS and postgres?

Some things, like GO, RNAi and Variation Phenotypes, need to come from WS.

Possibilities:

1) Start with the Paper object and then trace the information in the objects xref'ed in the Refers_to tag. This works for everything but Disease.

2) Look at each object in each relevant class. This seems computationally very intensive.
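
Possibility 1 could look roughly like the Python sketch below. It assumes each paper has already been loaded into a dict whose "Refers_to" entry maps a class name to a list of object dicts, each with a "genes" list; that layout, the function name, and the class-to-category mapping are hypothetical. Per-class conditions (for example, Physical interactions only, or Phenotype observed) are left out, and Variation is omitted because it needs tag checks to decide between Phenotype and Sequence.

```python
# Assumed mapping from classes under Refers_to to output categories.
CLASS_TO_CATEGORY = {
    "GO_annotation": "GO",
    "Interaction": "PPI",
    "RNAi": "Phenotype",
    "Expr_pattern": "Expression",
}


def trace_paper(paper):
    """Return a set of (gene, paper, category) triples for one paper record.

    paper: hypothetical dict of the form
        {"id": "WBPaper...", "Refers_to": {class_name: [{"genes": [...]}, ...]}}
    """
    triples = set()
    for class_name, objects in paper.get("Refers_to", {}).items():
        category = CLASS_TO_CATEGORY.get(class_name)
        if category is None:
            continue  # class is not relevant to the UniProt file
        for obj in objects:
            for gene in obj.get("genes", []):
                triples.add((gene, paper["id"], category))
    return triples
```

Disease is not reachable this way, so it would still need a separate pass starting from the Gene objects.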


Relevant tags in the different object models:


GO:

    ?GO_annotation -> Gene
                   -> Reference


PPI:

    ?Interaction -> Interaction_type Physical
                 -> Interactor_overlapping_gene
                 -> Paper


Phenotype:

    ?RNAi -> Inhibits
          -> Phenotype (Only Phenotype Observed, doesn't matter what the Phenotype is)
          -> Reference
    ?Variation -> Affects
               -> Phenotype (Only Phenotype Observed, doesn't matter what the Phenotype is)
               -> Reference


Expression:

    ?Expr_pattern -> Expression_of -> Gene
                  -> Reference


Sequence:

    ?Variation -> Affects
               -> Nonsense
               -> Missense
               -> Silent       (any one of Nonsense, Missense, Silent, Splice_site, Frameshift, Readthrough filled in)
               -> Splice_site
               -> Frameshift
               -> Readthrough
               -> Reference
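
The "any one of these filled in" test for Sequence could be sketched in Python as follows. The Variation record is assumed to be a dict whose "Affects" entry maps each gene ID to the set of tags filled in for that gene, with the papers under "Reference"; this layout and the function name are hypothetical.

```python
# Tags listed under Affects in the Sequence model above.
MOLECULAR_CHANGE_TAGS = (
    "Nonsense", "Missense", "Silent", "Splice_site", "Frameshift", "Readthrough",
)


def sequence_triples(variation):
    """Return (gene, paper, "Sequence") triples for one Variation record.

    variation: hypothetical dict of the form
        {"Affects": {gene_id: {tag, ...}}, "Reference": [paper_id, ...]}
    """
    triples = set()
    for gene, tags in variation.get("Affects", {}).items():
        # A gene qualifies if at least one of the listed tags is filled in.
        if any(tag in tags for tag in MOLECULAR_CHANGE_TAGS):
            for paper in variation.get("Reference", []):
                triples.add((gene, paper, "Sequence"))
    return triples
```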

Disease:

    ?Gene -> Disease_info -> Experimental_model -> Evidence -> Paper_evidence
          -> Disease_info -> Disease_relevance -> Evidence -> Paper_evidence
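
Because Disease has to be read from the Gene object, a separate pass is needed. A minimal Python sketch, assuming each gene has been loaded into a nested dict that mirrors the two tag paths above; the dict layout and the function name are hypothetical.

```python
# The two branches under Disease_info that carry Paper_evidence.
DISEASE_BRANCHES = ("Experimental_model", "Disease_relevance")


def disease_triples(gene):
    """Return (gene, paper, "Disease") triples for one Gene record.

    gene: hypothetical dict of the form
        {"id": "WBGene...",
         "Disease_info": {branch: [{"Evidence": {"Paper_evidence": [...]}}, ...]}}
    """
    triples = set()
    info = gene.get("Disease_info", {})
    for branch in DISEASE_BRANCHES:
        for entry in info.get(branch, []):
            papers = entry.get("Evidence", {}).get("Paper_evidence", [])
            for paper in papers:
                triples.add((gene["id"], paper, "Disease"))
    return triples
```

A paper cited under both branches yields a single triple, since the result is a set.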