Phasing out the manual annotations

From WormBaseWiki
Revision as of 19:11, 19 February 2016 by Rkishore (talk | contribs)

The plan is to review groups of annotations, selected by date and/or by their text, and phase them out of the concise descriptions dump that is submitted for upload. Eventually we may also need to delete them from Postgres.

Note: This applies only to descriptions whose Description Type in the OA is 'Concise_description', not 'Provisional_description'.

  • Genes whose 'Date last updated' in the OA is 2004-06-17 (3075 genes)
  • Genes that have the description below (also from 2004-06-17):
This gene encodes a protein containing an F-box, a motif predicted to mediate
protein-protein interactions either with homologs of yeast Skp-1p or with other
proteins.

Examples: fbxb genes, fbxa genes, a few sdz genes, and a number of unnamed (uncloned?) genes; 295 genes in total.

  • Genes that have the description below (most are from 2004-06-17):
The protein product of this gene is predicted to contain a glutamine/asparagine
(Q/N)-rich ('prion') domain, by the algorithm of Michelitsch and Weissman (as of
the WS77 release of WormBase, i.e., in wormpep77).

Examples: pqn genes, some abu genes, and some unnamed (uncloned?) genes; 72 genes in total.

  • Genes whose description contains the word 'disease' (113 genes). We will need to check whether these can be represented in human disease model data. Example:
hex-1 encodes a beta-N-acetylhexosaminidase that is orthologous to the human gene
CERVICAL CANCER PROTO-ONCOGENE 7 (HEXB; OMIM:606873), which when mutated leads to
disease.
  • Genes whose description contains the word 'syndrome' (85 genes). We will need to check whether these can be represented in human disease model data.
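The groups listed above could be selected with a small script along the following lines. This is only a sketch in Python: the record layout (the `Description` class and its field names), the group labels, and the matching strings are assumptions made for illustration and do not reflect the actual OA or Postgres schema. Whitespace is normalized before matching because the stored descriptions may contain line breaks or doubled spaces.

```python
import re
from dataclasses import dataclass

# Assumed matching strings, taken from the description texts quoted above.
LEGACY_DATE = "2004-06-17"
FBOX_PREFIX = "This gene encodes a protein containing an F-box"
PRION_MARKER = "glutamine/asparagine (q/n)-rich"


@dataclass
class Description:
    """Hypothetical record for one description; not the real OA schema."""
    gene: str
    desc_type: str     # e.g. 'Concise_description' or 'Provisional_description'
    last_updated: str  # ISO date string, e.g. '2004-06-17'
    text: str


def normalize(text: str) -> str:
    """Collapse runs of whitespace (including line breaks) to single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def phase_out_groups(d: Description) -> list[str]:
    """Return the phase-out groups a description falls into (may be empty)."""
    # Only concise descriptions are candidates; provisional ones are skipped.
    if d.desc_type != "Concise_description":
        return []
    text = normalize(d.text)
    lowered = text.lower()
    groups = []
    if d.last_updated == LEGACY_DATE:
        groups.append("date_2004-06-17")
    if text.startswith(FBOX_PREFIX):
        groups.append("f-box")
    if PRION_MARKER in lowered:
        groups.append("q/n-rich")
    # Substring match, so plural forms such as 'diseases' are also caught.
    if "disease" in lowered:
        groups.append("disease")
    if "syndrome" in lowered:
        groups.append("syndrome")
    return groups
```

A description can fall into more than one group (for example, the date group and the F-box group), so counts per group should not be summed to estimate the total number of genes affected.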