20141022 - Phenotype2GO Pipeline

From WormBaseWiki
Revision as of 17:27, 22 October 2014 by Vanaukenk (talk | contribs) (→‎Papers)

This page summarizes how the Phenotype2GO-based GO annotations will be handled for the WormBase builds and GO uploads.

Source of Annotations

  • Phenotype2GO-based annotations are generated as part of the WB build based upon the presence of a phenotype associated with a perturbation of gene activity.
  • The perturbation may be either an RNAi experiment or a variation.
  • If the perturbation is an RNAi experiment, we will only associate the GO annotations with phenotypes resulting from disruption of the primary target. Secondary targets will not be used for Phenotype2GO annotations.
  • All papers will be included in the pipeline (see below for some details on this).
  • It will be up to Caltech to integrate the annotations into their curation database and perform any necessary editing and QC there.
  • Phenotype2GO annotations displayed in WB will come from the Caltech .ace file uploaded with each build.
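
The primary-target rule above can be sketched as follows. This is a minimal illustration, not the build code: the `RnaiExperiment` record, its field names, the function name, and all IDs are hypothetical stand-ins for the data held in the ?RNAi objects.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RnaiExperiment:
    # Hypothetical record; the field names are illustrative and do not
    # mirror the actual ?RNAi model.
    experiment_id: str
    primary_targets: List[str]
    secondary_targets: List[str] = field(default_factory=list)
    phenotypes: List[str] = field(default_factory=list)


def primary_gene_phenotypes(
    experiments: List[RnaiExperiment],
) -> List[Tuple[str, str]]:
    """Return (gene, phenotype) pairs, using primary targets only."""
    pairs = []
    for exp in experiments:
        # Secondary targets are deliberately ignored.
        for gene in exp.primary_targets:
            for phenotype in exp.phenotypes:
                pairs.append((gene, phenotype))
    return pairs


# Placeholder IDs, for illustration only.
exp = RnaiExperiment(
    experiment_id="RNAi_example_1",
    primary_targets=["GeneA"],
    secondary_targets=["GeneB"],
    phenotypes=["PhenotypeX"],
)
print(primary_gene_phenotypes([exp]))  # [('GeneA', 'PhenotypeX')]
```

Here GeneB never receives a pair, because it appears only as a secondary target.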

Source of Phenotype2GO Mappings

  • All mappings between WB Phenotype Ontology terms and Gene Ontology terms are supplied during the build in a separate .ace file submitted by Ranjana.
  • No mappings are present in the ?Phenotype objects submitted separately during the build, as they could overwrite the mappings coming from the GO curators.
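
One way to picture the single-source arrangement: downstream code consults exactly one lookup table, so nothing carried in the ?Phenotype objects can shadow it. The sketch below assumes the mappings have already been parsed from the .ace file into a Python dict; the function name and all IDs are hypothetical placeholders, not real mappings.

```python
from typing import Dict, List, Set, Tuple


def apply_phenotype2go(
    pairs: List[Tuple[str, str]],
    phenotype2go: Dict[str, Set[str]],
) -> List[Tuple[str, str, str]]:
    """Turn (gene, phenotype) pairs into (gene, go_id, phenotype) triples."""
    annotations = set()
    for gene, phenotype in pairs:
        # Phenotypes with no mapping simply produce no annotation.
        for go_id in phenotype2go.get(phenotype, set()):
            annotations.add((gene, go_id, phenotype))
    # Sort so that the output order is stable between runs.
    return sorted(annotations)


# Placeholder mapping and pairs, for illustration only.
mapping = {"PhenotypeX": {"GO:term1", "GO:term2"}}
pairs = [("GeneA", "PhenotypeX"), ("GeneA", "PhenotypeY")]
for triple in apply_phenotype2go(pairs, mapping):
    print(triple)
```

PhenotypeY has no entry in the mapping, so it yields no annotation.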

GAFs

  • The GAF generated as part of the WB build will contain all of the manual annotations and all IEA annotations.
  • The Phenotype2GO-based annotations will be in a separate GAF and will use the IEA evidence code. This GAF will be the basis for what Caltech uploads to postgres after each build.
  • Caltech will have to dump out a separate GAF from postgres for Phenotype2GO-based annotations to send to the GOC.
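
For orientation, a line in the separate Phenotype2GO GAF might be assembled as below. This sketch assumes GAF 2.0 (17 tab-separated columns). The helper name, the choice to put the phenotype ID in the With/From column, the reference format, and every ID shown are assumptions for illustration, not a description of the actual build output.

```python
from typing import List

GAF_HEADER = "!gaf-version: 2.0"


def phenotype2go_gaf_line(gene_id: str, symbol: str, go_id: str, aspect: str,
                          reference: str, phenotype_id: str, date: str) -> str:
    """Build one GAF 2.0 line that uses the IEA evidence code."""
    columns: List[str] = [
        "WB",          # 1  DB
        gene_id,       # 2  DB Object ID
        symbol,        # 3  DB Object Symbol
        "",            # 4  Qualifier
        go_id,         # 5  GO ID
        reference,     # 6  DB:Reference
        "IEA",         # 7  Evidence Code
        phenotype_id,  # 8  With/From (assumption: the source phenotype)
        aspect,        # 9  Aspect (P, F or C)
        "",            # 10 DB Object Name
        "",            # 11 DB Object Synonym
        "gene",        # 12 DB Object Type
        "taxon:6239",  # 13 Taxon (C. elegans)
        date,          # 14 Date, YYYYMMDD
        "WB",          # 15 Assigned By
        "",            # 16 Annotation Extension
        "",            # 17 Gene Product Form ID
    ]
    return "\t".join(columns)


# Placeholder values, for illustration only.
line = phenotype2go_gaf_line(
    gene_id="GeneA_ID", symbol="gene-a", go_id="GO:term1", aspect="P",
    reference="WB_REF:PaperID", phenotype_id="WB:PhenotypeX", date="20141022",
)
print(GAF_HEADER)
print(line)
```

Keeping these lines in their own file means the main GAF and the Phenotype2GO GAF can be generated, loaded, and dumped independently.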

Papers

  • All papers will be included, but Caltech would like to exclude from the pipeline those phenotypes where the experiments reflect genetic interactions rather than (presumed) single-gene phenotypes, because we don't yet have a good way to express this correctly in the GAF.
  • For example, the Lehner et al., 2006 paper has a large number of RNAi experiments, but these are all in the background of a genetic variation. Thus, the phenotypes are believed to be the result of an interaction.
  • Could the Interaction tag in the ?RNAi object be used to exclude such experiments from the Phenotype2GO-based pipeline?
  • If Caltech wishes, we could at some point review these papers and upload the annotations in bulk into postgres using the IGI evidence code.
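
If the Interaction tag turns out to be usable for this purpose, the exclusion could be as simple as the filter sketched below. This is hypothetical: it assumes each experiment has been reduced to a dict with an `interactions` list populated from the Interaction tag, which is not a confirmed representation of the ?RNAi object. The function name and IDs are likewise placeholders.

```python
from typing import Dict, List


def single_gene_experiments(experiments: List[Dict]) -> List[Dict]:
    """Keep only experiments that have no associated interaction."""
    # A missing or empty "interactions" entry counts as "no interaction".
    return [exp for exp in experiments if not exp.get("interactions")]


# Placeholder experiments, for illustration only.
experiments = [
    {"id": "RNAi_example_1", "interactions": []},
    {"id": "RNAi_example_2", "interactions": ["Interaction_example_1"]},
]
kept = single_gene_experiments(experiments)
print([exp["id"] for exp in kept])  # ['RNAi_example_1']
```

The excluded experiments would remain available for later review, for example as candidates for IGI-based annotation.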