Difference between revisions of "UniProt-GOA syntax checking"

From WormBaseWiki

Revision as of 13:42, 9 August 2012

Errors to Fix

With/From Column

  • IC annotations need a GO term in With/From - DONE
  • ISS annotations need database identifiers in With/From
    1. 917 ISS annotations in Postgres as of 2012-07-23; some are legacy annotations without a With/From entry
  • IMP annotations - some annotations use transgenes in the With/From column. This syntax should be okay; see jnk-1 and sir-2.1 for examples. Action - discuss with Rachael, Tony
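
The IC and ISS rules above could be checked with a short script. The following is only a sketch, assuming the annotations are available as tab-separated GAF 2.0 lines (evidence code in column 7, With/From in column 8); the function name `check_with_from` and the identifier patterns are illustrative and are not part of any existing WormBase or UniProt-GOA tool.

```python
import re

# 0-based positions in a tab-separated GAF 2.0 line.
EVIDENCE_COL = 6   # column 7: evidence code
WITH_FROM_COL = 7  # column 8: With/From

# Illustrative patterns (assumptions): a GO term, and a generic DB:accession.
GO_ID = re.compile(r"^GO:\d{7}$")
DB_ID = re.compile(r"^[A-Za-z0-9_.-]+:\S+$")


def check_with_from(gaf_line):
    """Return a list of With/From problems found in one GAF annotation line."""
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) < 15:
        return ["fewer than 15 columns"]
    evidence = cols[EVIDENCE_COL]
    # With/From may hold several values separated by '|' or ','.
    values = [v for v in re.split(r"[|,]", cols[WITH_FROM_COL]) if v]
    problems = []
    if evidence == "IC" and not any(GO_ID.match(v) for v in values):
        problems.append("IC annotation lacks a GO term in With/From")
    if evidence == "ISS" and not any(DB_ID.match(v) for v in values):
        problems.append("ISS annotation lacks a database identifier in With/From")
    return problems
```

A check like this only reports problems; legacy ISS annotations without a With/From entry would still need curator review.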

Phenotype2GO Pipeline

  1. Remove annotations mapping to ncRNAs (e.g. 21U RNAs) and check again for pseudogene exclusion
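
One way to sketch this filtering step, assuming a lookup from gene ID to biotype is available: the `biotype_by_gene` dictionary and the biotype labels below are hypothetical placeholders, not the actual Phenotype2GO data structures.

```python
# Hypothetical biotype labels; the real pipeline may use different names.
EXCLUDED_BIOTYPES = {"ncRNA", "pseudogene"}


def filter_annotations(gaf_lines, biotype_by_gene):
    """Split GAF lines into (kept, dropped) based on the gene's biotype."""
    kept, dropped = [], []
    for line in gaf_lines:
        if line.startswith("!"):
            # GAF header/comment lines are always kept.
            kept.append(line)
            continue
        gene_id = line.split("\t")[1]  # column 2: DB Object ID
        if biotype_by_gene.get(gene_id) in EXCLUDED_BIOTYPES:
            dropped.append(line)
        else:
            kept.append(line)
    return kept, dropped
```

Returning the dropped lines as well makes it easy to review what was excluded, for example to confirm that pseudogenes are still being removed.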

IEA Pipelines

  1. UniProtKB will perform InterPro2GO mappings in-house
  2. TMHMM-derived annotations need a resolvable accession in With/From column
    1. Is there an accession for TMHMM?
    2. If not, what could be used in place of this pipeline if we stand to lose annotations?

Annotations to ncRNAs

  • Need a specific mapping file for those genes
    1. Action - contacted Rama to see whether appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.

Annotations to Uncloned Genes

  • Need a specific mapping file for those genes
    1. Action - contacted Rama to see whether appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
  • Genes affected (partial list):
    • cad-1
    • exc-1
    • exc-2
    • exc-3
    • exc-6
    • exc-8
    • hid-2
    • hid-4
    • ric-1
    • seu-2
    • seu-3
    • sog-1
    • sog-2
    • sog-3
    • sog-4
    • sog-5
    • sog-6
    • sog-10
    • szy-1
    • szy-2
    • szy-3
    • szy-4
    • szy-5
    • szy-6
    • szy-7
    • szy-8
    • szy-9
    • szy-10
    • szy-11
    • szy-12
    • szy-13
    • szy-14
    • szy-15
    • szy-16
    • szy-17
    • szy-18
    • szy-19
    • unc-65

gp2protein File

  • Need a version that is updated as often as possible to keep IDs closely in sync.
  • UniProtKB can upload the file nightly.
    1. Action - need to develop a pipeline for more frequent updates of the WB gp2protein file, as well as gp2ncRNA and gp2unlocalized
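
As a rough illustration of how ID drift could be detected between updates, the sketch below lists annotated genes that have no protein in a gp2protein file. It assumes the file has two tab-separated columns (gene ID, then semicolon-separated protein IDs) and that comment lines start with `!`; the function names are hypothetical.

```python
def load_gp2protein(lines):
    """Parse gp2protein lines into {gene_id: [protein_id, ...]}."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and header comments
        gene, _, proteins = line.partition("\t")
        mapping[gene] = [p for p in proteins.split(";") if p]
    return mapping


def unmapped_genes(gene_ids, mapping):
    """Return, sorted, the gene IDs that are absent or have no protein."""
    return sorted(g for g in set(gene_ids) if not mapping.get(g))
```

Running `unmapped_genes` over the gene IDs in the current GAF after each gp2protein refresh would give a quick list of entries that have fallen out of sync.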

Unsupported/Missing Reference

  • Published papers without PMIDs - including DOIs instead would be fine, if available.
    1. WBPaper00004663 - added DOI in paper editor; will need to dump the DOI in the WB GAF


  • Some GO references still refer to meeting abstracts
    1. WBPaper00011144 ced-11 meeting abstract. Action - deleted annotation. DONE.
    2. WBPaper00022068 ced-11 meeting abstract. Action - deleted annotation. DONE. Added an IGI annotation with ced-3 from WBPaper00003815.
    3. WBPaper00011088 ces-1 meeting abstract. Action - deleted annotation. DONE. Added an IC annotation with GO:0043565 in With/From.
    4. WBPaper00018934 ceh-37 meeting abstract. R.


  • Annotations from the P2GO pipeline that reference a paper's erratum, not the original paper
    1. WBPaper00006304 should be WBPaper00005637


  • Annotations from the P2GO pipeline that reference a duplicate paper object for which there is no bibliographic information in WormBase
    1. WBPaper00005149 should be merged into WBPaper00005123
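
Until the paper objects themselves are corrected, the two replacements listed above could be applied when the GAF is written. This is a minimal sketch, assuming references appear in the GAF reference column as pipe-separated `PREFIX:ID` values; the `WB_REF` prefix in the usage example and the function name `remap_reference` are assumptions for illustration.

```python
# Paper ID replacements taken from the list above.
PAPER_REPLACEMENTS = {
    "WBPaper00006304": "WBPaper00005637",  # erratum -> original paper
    "WBPaper00005149": "WBPaper00005123",  # duplicate -> merge target
}


def remap_reference(reference_field):
    """Rewrite known-bad WBPaper IDs in a pipe-separated reference field."""
    parts = []
    for ref in reference_field.split("|"):
        prefix, sep, ident = ref.partition(":")
        # IDs not in the table (and values without a prefix) pass through unchanged.
        parts.append(prefix + sep + PAPER_REPLACEMENTS.get(ident, ident))
    return "|".join(parts)
```

For example, `remap_reference("WB_REF:WBPaper00006304")` returns `"WB_REF:WBPaper00005637"`, while a value such as `"PMID:12345"` is returned unchanged.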

