Gene Ontology

From WormBaseWiki
Revision as of 18:25, 28 September 2010 by Rkishore (talk | contribs) (→‎New Terms)

Manual Literature Curation

  1. Lung Development Targets (November 2009 - February 2010)

Semi-Automated Methods of Curation

Textpresso-Based Curation

  • CCC - GO Cellular Component Curation using Textpresso
  1. dictyBase
  2. FlyBase
  3. TAIR
  4. WormBase
  • MFC - GO Molecular Function Curation using Textpresso

Phenotype2GO pipeline (Sanger and Caltech)

  • The old Sanger script that generates the gene_association file (from Igor's work in January 2009) was changed. Instead of an exclusion list, an 'include list' of papers (mostly large-scale genome-wide studies) is provided to the script. This list is curator-approved and explicitly agreed upon for the propagation of GO terms to genes based on their RNAi phenotypes.
  • To use the new script, invoke it with the -includelist option, e.g.: run parse_go_terms_new.pl -o gene_association.wb -rnai -include includelist.txt. This example parses only RNAi experiments; to generate the full file, also give the '-gene -var' options as before.
  • If you invoke it with the '-acefile <filename>' option, the script will also generate Gene-GO_term connections derived from phenotypes. This is currently done by the phenotype procedure of the inherit_GO_terms.pl script.
  • The old script, inherit_GO_terms.pl, does not consult any exclusion/inclusion files. A patch file was provided to alter Sanger's version of parse_go_terms_new.pl.
  • Current status: From Igor's e-mail, March 2009: I don't think the phenotype option of the inherit_go_terms script has been disabled. The script should be run without the '-variation' option, but the gene_association file still has those. Try this:

grep -i wbpheno gene_association.WS200.wb.ce | grep -v RNAi

This is now resolved.
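The check in the grep pipeline can also be sketched in Python, which may be handier when the surviving lines need further processing. This is only an illustration: the function name and the sample rows are made up for this sketch, and the rows are not real gene_association lines. It keeps lines that mention 'wbpheno' in any letter case and drops those containing 'RNAi', as the two grep calls do.

```python
def phenotype_lines_without_rnai(lines):
    """Yield lines that mention 'wbpheno' (any case) but do not contain 'RNAi'.

    Mirrors: grep -i wbpheno <file> | grep -v RNAi
    """
    for line in lines:
        if "wbpheno" in line.lower() and "RNAi" not in line:
            yield line


# Hypothetical sample rows, invented for illustration only.
sample = [
    "WB\tWBGene00000001\tWBPhenotype:0000050|WBRNAi00000001",
    "WB\tWBGene00000002\tWBPhenotype:0000050|WBVar00000001",
    "WB\tWBGene00000003\tINTERPRO:IPR000001",
]

leftover = list(phenotype_lines_without_rnai(sample))
for line in leftover:
    print(line)
```

To run the same check on a real file, pass an open file handle instead of the sample list. An empty result would mean no phenotype-derived annotations remain other than the RNAi-based ones.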

InterPro2GO Mappings for IEA Annotations
Reference Genome Inferential Annotations

Software Development: Tools and Scripts

WormBase contributions to Gene Ontology content

New Terms
  • regulation terms for octopamine/tyramine signaling involved in the response to food (2010)
  • alpha-tubulin acetylation (2010)
  • phagosome maturation involved in apoptotic cell clearance (2010)
  • neuropeptide receptor binding (2010)
  • striated muscle contraction involved in embryonic body morphogenesis (2010)
  • striated muscle myosin thick filament assembly (2010)
  • striated muscle paramyosin thick filament assembly (2010)
  • determination of left/right asymmetry in the nervous system (2010)
  • regulation of locomotion (including positive and negative regulation child terms) involved in locomotory behavior (2010)
  • detoxification of arsenic (2010)
  • chondroitin sulfate proteoglycan binding (2010)
  • chondroitin sulfate binding (2010)
  • regulation (includes positive and negative regulation child terms) of nematode larval development (2010)
  • regulation (includes positive and negative regulation child terms) of dauer larval development (2010)
  • phosphatidylserine exposure on apoptotic cell surface (2009)
  • regulation of synaptic vesicle priming (2008)
  • chloride-activated potassium channel activity (2008)
  • transdifferentiation (2008)
  • Regulation of ovulation terms (2008)
  • Process terms for gap junction proteins (2008)
  • piRNA and 21U-RNA terms (2008)
  • dense body (sensu Nematoda) cellular component term (2007)
  • GO:0000775, GO:0000779, GO:0000780
  • D/V and A/P axon guidance terms (2007)
  • palmitoyl-CoA 9-desaturase activity (2007)
  • response to hyperoxia (2007)
  • Cuticle component terms (2007)
  • response to anoxia (2007)
  • dynein light intermediate chain binding (2006)
  • Regulation terms for cell and nuclear division (2006)
  • Several child terms for apoptosis (2006)
  • Cilium terms (2005)
  • Intraflagellar transport particle-component terms (2004)
  • oogenesis (non-species-specific term) (2004)
Modifications to the Ontology
  • Revised definition for muscle homeostasis (2010)
  • Added dense core vesicle synonym to dense core granule (2010)
  • Updated definition and moved parentage for intraflagellar transport (2009)
  • Added lethargus as synonym for sleep (2008)
  • Change to the definitions of the component terms: GO:0000775, GO:0000779, GO:0000780 which refer to the centromeres or chromosome, pericentric region (2007)
  • Change to parent of tail tip morphogenesis (sensu Nematoda) (2006)
  • GO:0046536, dosage compensation complex definition (2006)
Requests that are pending
  • response to drug withdrawal (2010)

Annotation Practices

1. When annotating to Cellular Component terms:

If a protein contains a transmembrane domain, but expression experiments are not at sufficient resolution to show membrane localization, what annotation should we make?

Example: WBPaper00036024

Plans/Projects in progress

Changes to the GO data model
  • Add tags for accommodating data in WormBase that are already in the gene association file:
    • Qualifying an annotation with the qualifiers 'NOT', 'contributes_to', or 'colocalizes_with'
    • Using the generic GO_REF tags for generic references, e.g., for a NOT annotation; need to add the proper database and accession syntax (need to add a field in the curation interface in OA).
    • 'With' or 'From', for additional identifiers used with certain evidence codes such as IPI and IGI.
    • Annotation Extension, for containing cross-references to other ontologies, one of:
      • DB:gene_id
      • DB:sequence_id
      • CHEBI:CHEBI_id
      • Cell Type Ontology:CL_id
      • GO:GO_id
    • Gene Product Form ID, a canonical entry for specific variants of gene products.
      • When the gene product form ID (column 17 of the gene association file) is filled with a protein identifier, the value in DB object type (column 12) must be protein. Protein identifiers can include UniProtKB accession numbers, NCBI NP identifiers, or Protein Ontology (PRO) identifiers.
      • When the gene product form ID (column 17 of the gene association file) is filled with a functional RNA identifier, the DB object type (column 12) must be one of ncRNA, rRNA, tRNA, snRNA, or snoRNA.
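
The two column rules for the Gene Product Form ID can be sketched as a small consistency check. This is a minimal illustration under stated assumptions, not part of any WormBase or GO tool: the function name, the `form_kind` argument, and the sample row are hypothetical, and the caller has to say whether the column 17 identifier is a protein or a functional RNA, since the sketch does not try to classify identifiers.

```python
# Allowed DB object types (column 12) when column 17 holds a functional RNA ID.
RNA_OBJECT_TYPES = {"ncRNA", "rRNA", "tRNA", "snRNA", "snoRNA"}


def object_type_matches_form(row, form_kind):
    """Check column 12 against the kind of identifier in column 17.

    row: list of the tab-separated fields of one gene association line.
    form_kind: "protein" or "functional_rna", supplied by the caller
        (an assumption of this sketch; identifiers are not classified here).
    """
    if len(row) < 17 or not row[16]:
        return True  # no gene product form ID, so the rule does not apply
    object_type = row[11]  # column 12, zero-based index 11
    if form_kind == "protein":
        return object_type == "protein"
    if form_kind == "functional_rna":
        return object_type in RNA_OBJECT_TYPES
    raise ValueError(f"unknown form_kind: {form_kind!r}")


# Hypothetical row: 17 fields, with only columns 12 and 17 filled in.
# The accession is a placeholder, not a real UniProtKB entry.
row = [""] * 17
row[11] = "gene"
row[16] = "UniProtKB:P00000"
print(object_type_matches_form(row, "protein"))
```

In this example the row pairs a protein identifier in column 17 with 'gene' in column 12, so the check reports a mismatch. Changing column 12 to 'protein' would make it pass.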
Changes to the GO_term model and updating the ontology in WormBase

Back to Caltech documentation