Difference between revisions of "Phenotype2GO Analysis"
From WormBaseWiki
Revision as of 20:13, 20 June 2013
Named Genes with Phenotype2GO-Based Annotations
Gene Name | Phenotype2GO Annotation | Evidence Code | Experiment | Reference | Manual GO Annotation | Evidence Code | Comment | Recommend Keep Phenotype2GO Annotation? | Class of P2GO |
---|---|---|---|---|---|---|---|---|---|
hsf-1 | locomotion (GO:0040011) | IMP | RNAi in rrf-3 | 6395, Simmer et al., 2003, others | No manual annotations from 6395 | | May reflect role in regulating transcription (what are the targets?). No manual annotations to locomotion-related terms | No | Possible downstream effect |
met-1 | locomotion (GO:0040011) | IMP | RNAi | 26635, Gottschalk et al., 2005 | No manual GO annotation from 26635 | | Role of met-1 is not clear from this paper, but as a histone methyltransferase it is probably related to gene expression; manual annotations exist to cellular-level processes, plus one vulval development annotation | No | Possible downstream effect |
sax-3 | locomotion (GO:0040011) | IMP | RNAi | 5654, Kamath et al., 2003 | No manual GO annotation from 5654 | | sax-3 plays a role in neuron and muscle migration guidance; manual annotations exist to axon guidance, neuron migration | No | Downstream effect |
skn-1 | embryonic development (GO:0009790) | IMP | RNAi | 28492, Maduro et al., 2007 | endodermal cell fate specification | IMP | P2GO annotation is correct, but less granular than manual annotation | Not necessary - manual exists | High-level term |
smg-8 | locomotion (GO:0040011) | IMP | RNAi | 37111, Izumi et al., 2010 | No manual annotation from 37111 | | smg-8 encodes a novel protein; only CC annotations exist | No | Downstream effect: phenotype used as output for a more general cellular process; effect on locomotion is apparently a result of unc-54 transcript stabilization; also incorrect evidence code |
mes-4 | determination of adult lifespan (GO:0008340) | IMP | RNAi | 33449, Curran et al., 2009 | No manual GO annotation for mes-4 from 33449 | | Effect on lifespan may be due to role in regulation of germline transcription, germ cell survival | No | Possible downstream effect - MES-4 is required for germ cell fate specification |
gld-3 | apoptosis (obsolete) | IMP | RNAi | 38381 | maybe a germ cell development annotation? | IMP | | | |
fem-3 | embryonic development ending in birth or egg hatching (GO:0009792) | IMP | RNAi | 5599 | large-scale screen | small-scale experiment supports this but is not yet annotated (35459) |
Goals and strategy moving forward
- Remove annotation redundancy in the pipeline, i.e., cases where a paper has been curated both manually and via the automated pipeline
- Determine how many genes have only Phenotype2GO-based GO annotations; this will give us an idea of what we'd lose if we pull the Phenotype2GO annotations
- Based on the gene count from the previous item, devise a strategy for going forward. Options (any combination of these, or other ideas?):
  - remove all automated annotations
  - keep some, but remove the most egregious mappings and change the evidence code (EC) to IEA
  - put non-redundant but automated IMP or IEA annotations in a separate folder in GO (follow up with CM on this)
  - prioritize genes for manual curation
From the Phenotype2GO (P2GO) annotation file we need the following information:
- How many unique genes have P2GO annotations
- How many of these genes have a manual or IEA (Process) annotation
- How many genes have only P2GO annotations (will be obtained from the above)
- Input files:
  - manual.go
  - electronic.go
  - rnai.go
  - variation???
  - external groups, i.e. UniProtKB??? Need to convert UniProtKB ids to WBGene IDs??
- Rough specs for some scripts to help generate numbers and hopefully strategy
  - From rnai.go file generate list of unique WBGene IDs
  - Using list of unique WBGene IDs from rnai.go, check manual.go file for overlapping WBGenes with a P in Column 9
  - Using list of unique WBGene IDs from rnai.go, check uniprotkb.go file for overlapping WBGenes with a P in Column 9
  - Using list of unique WBGene IDs from rnai.go, check electronic.go file for overlapping WBGenes with a P in Column 9
  - For each file checked, output the list of WBGene IDs that do NOT overlap
  - Compare each output list and determine
    - Genes with rnai.go but no other annotations from 3 other files
    - Genes with rnai.go but no other annotations from each pair of files (i.e., no manual and no uniprotkb, but electronic)
  - Sort list of WBGenes in each file based on number of unique phenotype associations in phenotype gaf, descending order
    - This will give us a list of genes sorted according to their 'phenotype' information content and might help prioritize
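
The script specs above could be sketched roughly as follows. This is only a minimal sketch, not an existing WormBase script. It assumes the .go files are tab-delimited GAF files with the WBGene ID in Column 2 and the aspect (P, F or C) in Column 9, that comment lines start with `!`, and that the phenotype gaf carries the phenotype term ID in Column 5. All function names (`gaf_rows`, `gene_ids`, `compare_to_rnai`, `phenotype_counts`, `sort_by_phenotype_count`) are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def gaf_rows(lines):
    """Yield the tab-separated columns of each annotation line, skipping '!' comments."""
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 9:  # need at least up to Column 9 (aspect)
            yield cols


def gene_ids(lines, aspect=None):
    """Unique Column 2 IDs (assumed WBGene IDs), optionally limited to one aspect."""
    return {c[1] for c in gaf_rows(lines) if aspect is None or c[8] == aspect}


def compare_to_rnai(rnai_ids, others):
    """Compare rnai.go genes against the other files.

    others maps a file label to the set of genes with a P annotation in it.
    Returns (missing, rnai_only, pair_only):
      missing[label]    rnai.go genes that do NOT overlap with that file
      rnai_only         rnai.go genes absent from every other file
      pair_only[(a, b)] rnai.go genes absent from a and b but present elsewhere
    """
    missing = {label: rnai_ids - ids for label, ids in others.items()}
    rnai_only = set(rnai_ids)
    for ids in missing.values():
        rnai_only &= ids
    pair_only = {
        (a, b): (missing[a] & missing[b]) - rnai_only
        for a, b in combinations(sorted(missing), 2)
    }
    return missing, rnai_only, pair_only


def phenotype_counts(lines):
    """Number of unique phenotype terms (assumed Column 5) per gene."""
    terms = defaultdict(set)
    for c in gaf_rows(lines):
        terms[c[1]].add(c[4])
    return {gene: len(t) for gene, t in terms.items()}


def sort_by_phenotype_count(genes, counts):
    """Sort genes by phenotype count, descending; ties broken by gene ID."""
    return sorted(genes, key=lambda g: (-counts.get(g, 0), g))
```

Each function takes any iterable of lines, so an open file handle for rnai.go, manual.go, uniprotkb.go or electronic.go can be passed directly. The UniProtKB file would first need its IDs converted to WBGene IDs, which this sketch does not attempt.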
- Note: at least one paper, WBPaper000