Phenotype2GO Analysis

From WormBaseWiki

Named Genes with Phenotype2GO-Based Annotations

| Gene Name | Phenotype2GO Annotation | P2GO Evidence Code | Experiment | Reference | Manual GO Annotation | Manual Evidence Code | Comment | Recommend Keep Phenotype2GO Annotation? | Class of P2GO |
| hsf-1 | locomotion (GO:0040011) | IMP | RNAi in rrf-3 | 6395, Simmer et al., 2003, others | No manual annotations from 6395 | | May reflect role in regulating transcription - what are the targets? No manual annotations to locomotion-related terms | No | Possible downstream effect |
| met-1 | locomotion (GO:0040011) | IMP | RNAi | 26635, Gottschalk et al., 2005 | No manual GO annotation from 26635 | | Role of met-1 not clear from this paper, but as a histone methyltransferase it is probably related to gene expression; manual annotations exist to cellular-level processes and one vulval development annotation | No | Possible downstream effect |
| sax-3 | locomotion (GO:0040011) | IMP | RNAi | 5654, Kamath et al., 2003 | No manual GO annotation from 5654 | | sax-3 plays a role in neuron and muscle migration guidance; manual annotations exist to axon guidance, neuron migration | No | Downstream effect |
| skn-1 | embryonic development (GO:0009790) | IMP | RNAi | 28492, Maduro et al., 2007 | endodermal cell fate specification | IMP | P2GO annotation is correct, but less granular than manual annotation | Not necessary - manual exists | High-level term |
| smg-8 | locomotion (GO:0040011) | IMP | RNAi | 37111, Izumi et al., 2010 | No manual annotation from 37111 | | smg-8 encodes a novel protein; only CC annotations exist | No | Downstream effect: phenotype used as output for a more general cellular process; effect on locomotion apparently a result of unc-54 transcript stabilization; also incorrect evidence code |
| mes-4 | determination of adult lifespan (GO:0008340) | IMP | RNAi | 33449, Curran et al., 2009 | No manual GO annotation for mes-4 from 33449 | | Effect on lifespan may be due to role in regulation of germline transcription, germ cell survival | No | Possible downstream effect - MES-4 is required for germ cell fate specification |
| gld-3 | apoptosis (obsolete) | IMP | RNAi | 38381 | maybe a germ cell development annotation? | IMP | | | |
| fem-3 | embryonic development ending in birth or egg hatching (GO:0009792) | IMP | RNAi | 5599 | | | large-scale screen; small-scale experiment supports this but is not yet annotated (35459) | | |

Goals and strategy moving forward

  1. Remove annotation redundancy in the pipeline, i.e., cases where a paper has been curated both manually and via the automated pipeline
  2. Determine how many genes have only Phenotype2GO-based GO annotations - this will give us an idea of what we'd lose if we pull the Phenotype2GO annotations
  3. Based on #2, devise a strategy for going forward. Options: remove all automated annotations; keep some, but remove the most egregious mappings and change the evidence code (EC) to IEA; put non-redundant but automated IMP or IEA annotations in a separate folder in GO (follow up with CM on this); prioritize genes for manual curation; any combination of the above, or other ideas?


From the Phenotype2GO (P2GO) annotation file we need the following information:

  • How many unique genes have P2GO annotations
  • How many of these genes have a manual or IEA (Process) annotation
  • How many genes have only P2GO annotations (will be obtained from the above)
  • Input files:
    • manual.go
    • electronic.go
    • rnai.go
    • variation???
    • external groups, e.g. UniProtKB??? Would need to convert UniProtKB IDs to WBGene IDs??
  • Rough specs for some scripts to help generate numbers and hopefully strategy
    • From rnai.go file generate list of unique WBGene IDs
    • Using list of unique WBGene IDs from rnai.go, check manual.go file for overlapping WBGenes with a P in Column 9
    • Using list of unique WBGene IDs from rnai.go, check uniprotkb.go file for overlapping WBGenes with a P in Column 9
    • Using list of unique WBGene IDs from rnai.go, check electronic.go file for overlapping WBGenes with a P in Column 9
    • For each file checked, output the list of WBGene IDs that do NOT overlap
    • Compare each output list and determine
      • Genes with rnai.go annotations but no annotations in any of the 3 other files
      • Genes with rnai.go annotations but no annotations in each pair of files (e.g., no manual and no uniprotkb, but electronic)
    • Sort list of WBGenes in each file based on number of unique phenotype associations in phenotype gaf, descending order
      • This will give us a list of genes sorted according to their 'phenotype' information content and might help prioritize
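
The rough specs above could be sketched in Python along the following lines. This is only a minimal sketch under stated assumptions, not an agreed implementation: it assumes every input file is tab-delimited GAF with the gene ID in column 2, the annotated term (GO or phenotype ID) in column 5, the aspect in column 9, and comment lines starting with "!". All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def read_gaf_rows(lines):
    """Yield tab-split rows, skipping blank lines and '!' comment lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("!"):
            continue
        yield line.split("\t")


def gene_ids(lines, aspect=None):
    """Unique gene IDs (column 2); if aspect is given, keep only rows
    whose column 9 equals it (e.g. "P" for Process)."""
    ids = set()
    for row in read_gaf_rows(lines):
        if len(row) < 9:
            continue  # assumption: malformed short rows are ignored
        if aspect is None or row[8] == aspect:
            ids.add(row[1])
    return ids


def p2go_report(rnai_genes, other_sources):
    """other_sources maps a file label to its set of genes with a P annotation.

    Returns:
      missing_from: label -> rnai genes that do NOT overlap with that file
      p2go_only:    rnai genes absent from every other file
      pair_only:    (a, b) -> rnai genes absent from a and b but present
                    in all remaining files
    """
    missing = {name: rnai_genes - genes for name, genes in other_sources.items()}
    p2go_only = set(rnai_genes)
    for absent in missing.values():
        p2go_only &= absent
    pair_only = {}
    for a, b in combinations(sorted(missing), 2):
        absent_elsewhere = set()
        for name, absent in missing.items():
            if name not in (a, b):
                absent_elsewhere |= absent
        pair_only[(a, b)] = (missing[a] & missing[b]) - absent_elsewhere
    return {"missing_from": missing, "p2go_only": p2go_only, "pair_only": pair_only}


def rank_by_phenotype_count(phenotype_lines, genes):
    """Sort genes by number of unique phenotype terms (column 5), descending;
    ties are broken by gene ID so the order is reproducible."""
    terms = defaultdict(set)
    for row in read_gaf_rows(phenotype_lines):
        if len(row) >= 5 and row[1] in genes:
            terms[row[1]].add(row[4])
    return sorted(((g, len(terms[g])) for g in genes), key=lambda x: (-x[1], x[0]))
```

Because an open file iterates line by line, a call such as `gene_ids(open("manual.go"), aspect="P")` would work directly on the input files listed above. Sets are used throughout so that "genes with rnai.go but no other annotations" reduces to plain set difference and intersection.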





  • Note: at least one paper, WBPaper000