- 1 Manual Literature Curation
- 2 Semi-Automated Methods of Curation
- 3 Software Development: Tools and Scripts
- 4 WormBase contributions to Gene Ontology content
- 5 Annotation Practices
- 6 Plans/Projects in progress
Manual Literature Curation
- Reference Genome (see also Reference Genome Inferential Annotations)
Semi-Automated Methods of Curation
- CCC - GO Cellular Component Curation using Textpresso
- MFC - GO Molecular Function Curation using Textpresso
HMMs for Molecular Function - Enzymatic and Transporter Activities
Phenotype2GO pipeline (Sanger and Caltech)
- The old Sanger script that generates the gene_association file (from Igor's work in January 2009) was changed. Instead of an exclusion list, an 'include list' of papers (mostly large-scale, genome-wide studies) is provided to the script. This list is curator-approved: the papers on it have been explicitly agreed upon for propagating GO terms to genes based on their RNAi phenotypes.
- A new script is used. Invoke it with the include-list option, e.g.: parse_go_terms_new.pl -o gene_association.wb -rnai -include includelist.txt (this example parses only RNAi experiments; to generate the full file, also give the '-gene -var' options as before).
- If you invoke it with the '-acefile <filename>' option, the script will also generate Gene-GO_term connections derived from phenotypes. This is currently done by the phenotype procedure of the inherit_GO_terms.pl script.
- The old script, inherit_GO_terms.pl, does not consult any exclusion or inclusion files. A patch file was provided to alter Sanger's version of parse_go_terms_new.pl.
- Current status: From Igor's e-mail, March 2009: I don't think the phenotype option of the inherit_go_terms script has been disabled. The script should be run without the '-variation' option, but the gene_association file still has those. Try this:
grep -i wbpheno gene_association.WS200.wb.ce | grep -v RNAi
This is now resolved.
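The include-list idea and the grep check above can be sketched in Python. This is only an illustration, not the logic of parse_go_terms_new.pl: the function names are hypothetical, and the sketch assumes that the include list holds one paper identifier per line, that phenotype-derived lines mention a WBPhenotype identifier, and that the reference sits in column 6 of the gene association file.

```python
def load_include_list(lines):
    """Return the set of paper identifiers, skipping blank lines and '#' comments.

    Assumption: one identifier per line, written as it appears in column 6.
    """
    papers = set()
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            papers.add(entry)
    return papers


def keep_annotation(line, include):
    """Decide whether a gene association line survives the include-list filter.

    Lines that do not mention a phenotype are left untouched. Phenotype-derived
    lines are kept only if one of their references is on the include list.
    """
    if "wbpheno" not in line.lower():
        return True
    cols = line.rstrip("\n").split("\t")
    references = cols[5].split("|")  # column 6: DB:Reference, pipe-separated
    return any(ref in include for ref in references)


def non_rnai_phenotype_lines(lines):
    """Python equivalent of: grep -i wbpheno FILE | grep -v RNAi"""
    return [ln for ln in lines if "wbpheno" in ln.lower() and "RNAi" not in ln]
```

An empty result from non_rnai_phenotype_lines corresponds to the grep printing nothing, i.e. no phenotype-based annotations remain that come from something other than RNAi.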
InterPro2GO Mappings for IEA Annotations
Reference Genome Inferential Annotations
Software Development: Tools and Scripts
WormBase contributions to Gene Ontology content
- octopamine/tyramine signaling involved in the response to food (and the regulation terms) (2010)
- alpha-tubulin acetylation (2010)
- phagosome maturation involved in apoptotic cell clearance (2010)
- neuropeptide receptor binding (2010)
- striated muscle contraction involved in embryonic body morphogenesis (2010)
- striated muscle myosin thick filament assembly (2010)
- striated muscle paramyosin thick filament assembly (2010)
- determination of left/right asymmetry in the nervous system (2010)
- regulation of locomotion (including positive and negative regulation child terms) involved in locomotory behavior (2010)
- detoxification of arsenic (2010)
- chondroitin sulfate proteoglycan binding (2010)
- chondroitin sulfate binding (2010)
- regulation (includes positive and negative regulation child terms) of nematode larval development (2010)
- regulation of (includes positive and negative regulation terms) dauer larval development (2010)
- response to drug withdrawal (2009)
- phosphatidylserine exposure on apoptotic cell surface (2009)
- regulation of synaptic vesicle priming (2008)
- chloride-activated potassium channel activity (2008)
- transdifferentiation (2008)
- Regulation of ovulation terms (2008)
- Process terms for gap junction proteins (2008)
- piRNA and 21U-RNA terms (2008)
- dense body (sensu Nematoda) cellular component term (2007)
- GO:0000775, GO:0000779, GO:0000780
- D/V and A/P axon guidance terms (2007)
- palmitoyl-CoA 9-desaturase activity (2007)
- response to hyperoxia (2007)
- Cuticle component terms (2007)
- response to anoxia (2007)
- dynein light intermediate chain binding (2006)
- Regulation terms for cell and nuclear division (2006)
- Several child terms for apoptosis (2006)
- Cilium terms (2005)
- Intraflagellar transport particle-component terms (2004)
- oogenesis (non-species specific term) (2004)
Modifications to the Ontology
- Revised definition for muscle homeostasis (2010)
- Added dense core vesicle synonym to dense core granule (2010)
- Updated definition and moved parentage for intraflagellar transport (2009)
- Added lethargus as synonym for sleep (2008)
- Changed the definitions of the component terms GO:0000775, GO:0000779, and GO:0000780, which refer to the centromere or to the chromosome, pericentric region (2007)
- Change to parent of tail tip morphogenesis (sensu Nematoda) (2006)
- GO:0046536, dosage compensation complex definition (2006)
Annotation Practices
1. When annotating to Cellular Component terms:
If a protein contains a transmembrane domain, but expression experiments are not at sufficient resolution to show membrane localization, what annotation should we make?
Plans/Projects in progress
Changes to the GO data model
- Add tags for accommodating data in WormBase that are already in the gene association file:
- Qualifying an annotation with the qualifiers 'NOT', 'contributes_to', or 'colocalizes_with'
- Using the generic GO_REF tags for generic references, e.g., for a NOT annotation. This requires the proper database and accession syntax (a field needs to be added to the curation interface in OA).
- 'With' or 'From', for additional identifiers used with certain evidence codes such as IPI, IGI, etc.
- Annotation Extension, for cross-references to other ontologies, one of:
- Cell Type Ontology:CL_id
- Gene Product Form ID, a canonical entry for specific variants of gene products.
- When the gene product form ID (column 17 of ga) is filled with a protein identifier, the value in DB object type (column 12 of ga) must be protein. Protein identifiers can include UniProtKB accession numbers, NCBI NP identifiers or Protein Ontology (PRO) identifiers.
- When the gene product form ID (column 17 of ga) is filled with a functional RNA identifier, the DB object type (column 12 of ga) must be either ncRNA, rRNA, tRNA, snRNA, or snoRNA.
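The column rules above lend themselves to an automated check. The following is a minimal sketch, not an existing WormBase tool: check_row and the prefix list are hypothetical, and the sketch assumes a 17-column gene association line with qualifiers pipe-separated in column 4. It also assumes that any gene product form ID without a recognised protein prefix is a functional RNA identifier.

```python
# DB object types allowed when column 17 holds a functional RNA identifier.
RNA_TYPES = {"ncRNA", "rRNA", "tRNA", "snRNA", "snoRNA"}

# Qualifiers named in the list above.
ALLOWED_QUALIFIERS = {"NOT", "contributes_to", "colocalizes_with"}

# Assumption: protein identifiers can be recognised by these prefixes
# (UniProtKB accessions, Protein Ontology IDs, NCBI NP identifiers).
PROTEIN_PREFIXES = ("UniProtKB:", "PR:", "RefSeq:NP_")


def check_row(cols):
    """Return a list of problems found in one gene association row.

    cols is the row split on tabs; an empty list means no problem was found.
    """
    if len(cols) != 17:
        return ["expected 17 columns, got %d" % len(cols)]

    errors = []

    # Column 4: zero or more qualifiers separated by '|'.
    for qualifier in cols[3].split("|"):
        if qualifier and qualifier not in ALLOWED_QUALIFIERS:
            errors.append("unknown qualifier: %s" % qualifier)

    obj_type = cols[11]  # column 12: DB object type
    form_id = cols[16]   # column 17: gene product form ID
    if form_id:
        if form_id.startswith(PROTEIN_PREFIXES):
            if obj_type != "protein":
                errors.append("column 17 is a protein ID, so column 12 must be 'protein'")
        elif obj_type not in RNA_TYPES:
            errors.append("column 17 is treated as an RNA ID, so column 12 must be an RNA type")
    return errors
```

A row with an empty column 17 passes the form ID check regardless of its DB object type, which matches the rules above: they only constrain column 12 when column 17 is filled.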
Changes to the GO_term model and updating the ontology in WormBase