Difference between revisions of "UniProt-GOA syntax checking"

From WormBaseWiki

Revision as of 13:37, 9 August 2012

Errors to Fix

With/From Column

  1. IC annotations need GO term in With/From - DONE
  2. ISS annotations need database identifiers in With/From
    1. 917 ISS annotations in postgres as of 2012-07-23
    2. Some annotations use transgenes in the WITH column - this syntax should be okay, see jnk-1 and sir-2.1 for examples
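
The two With/From rules above could be checked mechanically against a GAF file. The following is a minimal sketch, assuming GAF 2.0 tab-delimited columns (evidence code in column 7, With/From in column 8). The function name `check_with_from` is hypothetical, and the "database identifier" test is only a loose `PREFIX:accession` pattern, not a validation against real database prefixes.

```python
import re

# GAF 2.0 column positions (0-based): evidence code is column 7, With/From is column 8.
EVIDENCE_COL = 6
WITH_COL = 7

GO_ID = re.compile(r"^GO:\d{7}$")
# Loose assumption: any "PREFIX:accession" entry counts as a database identifier.
DB_ID = re.compile(r"^[A-Za-z_][\w.-]*:\S+$")


def check_with_from(line):
    """Return an error message for a GAF line, or None if it passes."""
    if line.startswith("!") or not line.strip():
        return None  # header/comment or blank line
    cols = line.rstrip("\n").split("\t")
    if len(cols) <= WITH_COL:
        return "too few columns"
    evidence = cols[EVIDENCE_COL]
    # With/From entries are separated by "|" (later GAF versions also allow ",").
    entries = [e for e in re.split(r"[|,]", cols[WITH_COL]) if e]
    if evidence == "IC" and not any(GO_ID.match(e) for e in entries):
        return "IC annotation lacks a GO term in With/From"
    if evidence == "ISS" and not any(DB_ID.match(e) for e in entries):
        return "ISS annotation lacks a database identifier in With/From"
    return None
```

Running this over each line of the file and collecting the non-`None` results would give a count comparable to the ISS figure quoted above, although the exact number depends on how strictly identifiers are matched.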

Phenotype2GO Pipeline

  1. Remove annotations mapping to ncRNA and pseudogenes
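
A minimal sketch of that filtering step, assuming the gene biotype is available as a lookup keyed by the GAF DB Object ID (column 2). The `biotypes` mapping and its labels (`ncRNA`, `pseudogene`) are hypothetical; the real Phenotype2GO pipeline may obtain this information differently.

```python
# Assumed biotype labels; adjust to whatever the gene data source actually uses.
EXCLUDED_BIOTYPES = {"ncRNA", "pseudogene"}


def filter_annotations(gaf_lines, biotypes):
    """Yield GAF lines whose gene (column 2) is not an excluded biotype.

    Header lines starting with "!" are passed through unchanged.
    """
    for line in gaf_lines:
        if line.startswith("!"):
            yield line
            continue
        cols = line.split("\t")
        gene_id = cols[1] if len(cols) > 1 else ""
        if biotypes.get(gene_id) in EXCLUDED_BIOTYPES:
            continue  # drop annotations to ncRNA genes and pseudogenes
        yield line
```

Genes missing from the lookup are kept rather than dropped, so an incomplete biotype table does not silently remove annotations.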

IEA Pipelines

  1. UniProtKB will perform InterPro2GO mappings in-house
  2. TMHMM-derived annotations need a resolvable accession in With/From column
    1. Is there an accession for TMHMM?
    2. If not, what could be used in place of this pipeline if we stand to lose annotations?

Annotations to ncRNAs

  • Need a specific mapping file for those genes
  1. Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.

Annotations to Uncloned Genes

  • Need a specific mapping file for those genes
  1. Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
  • Genes affected (partial list):
    • cad-1
    • exc-1
    • exc-2
    • exc-3
    • exc-6
    • exc-8
    • hid-2
    • hid-4
    • ric-1
    • seu-2
    • seu-3
    • sog-1
    • sog-2
    • sog-3
    • sog-4
    • sog-5
    • sog-6
    • sog-10
    • szy-1
    • szy-2
    • szy-3
    • szy-4
    • szy-5
    • szy-6
    • szy-7
    • szy-8
    • szy-9
    • szy-10
    • szy-11
    • szy-12
    • szy-13
    • szy-14
    • szy-15
    • szy-16
    • szy-17
    • szy-18
    • szy-19
    • unc-65

gp2protein File

  • Need a version that is updated as often as possible so that IDs stay in sync.
  • UniProtKB can upload the file nightly.
  1. Action - need to develop a pipeline for more frequent updates of the WB gp2protein file, as well as gp2ncRNA and gp2unlocalized
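
One way to see how far the IDs have drifted out of sync is to list annotated objects that have no entry in the mapping file. The following is a rough sketch, assuming gp2protein is a two-column tab-delimited file (`DB:ID`, then one or more `;`-separated protein accessions) with `!` comment lines. Both function names are hypothetical, and the file layout should be confirmed against the actual file before use.

```python
def parse_gp2protein(lines):
    """Map each database object (column 1) to its list of accessions (column 2)."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blank and comment lines
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # malformed line without a second column
        mapping[parts[0]] = [acc for acc in parts[1].split(";") if acc]
    return mapping


def unmapped_objects(gaf_lines, mapping):
    """Return sorted DB:ID values annotated in the GAF but absent from the mapping."""
    missing = set()
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.split("\t")
        if len(cols) < 2:
            continue
        key = f"{cols[0]}:{cols[1]}"  # GAF columns 1 and 2: DB and DB Object ID
        if key not in mapping:
            missing.add(key)
    return sorted(missing)
```

The same check could be run against gp2ncRNA and gp2unlocalized once those files exist, assuming they follow the same layout.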

Unsupported/Missing Reference

  • Including DOIs instead would be fine, if available
  1. WBPaper00004663 - added DOI in paper editor; the DOI still needs to be dumped into the GAF


  • Some references on GO annotations still point to meeting abstracts
  1. WBPaper00011144 ced-11 meeting abstract. Action - deleted annotation. DONE.
  2. WBPaper00022068 ced-11 meeting abstract. Action - deleted annotation. DONE. Added an IGI annotation with ced-3 from WBPaper00003815.
  3. WBPaper00011088 ces-1 meeting abstract. Action - deleted annotation. DONE. Added an IC annotation with GO:0043565 in WITH/FROM.
  4. WBPaper00018934 ceh-37 meeting abstract. R.


  • Annotations from P2GO pipeline that reference a paper's erratum, not the original paper
  1. WBPaper00006304 should be WBPaper00005637


  • Annotations from P2GO pipeline that reference a duplicate paper object for which there is no bibliographic information in WormBase
  1. WBPaper00005149 should be merged into WBPaper00005123
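
The paper substitutions listed above (erratum to original paper, duplicate to surviving paper) could be applied to the GAF reference column with a small lookup. This is a minimal sketch, assuming the reference sits in GAF column 6 as `|`-separated `PREFIX:ID` entries. The function name `remap_references` is hypothetical, and in practice the fix may be better made in the source database than in the dumped file.

```python
# Replacements taken from the list above.
PAPER_REMAP = {
    "WBPaper00006304": "WBPaper00005637",  # erratum -> original paper
    "WBPaper00005149": "WBPaper00005123",  # duplicate -> paper it should be merged into
}
REFERENCE_COL = 5  # GAF column 6 (DB:Reference), 0-based


def remap_references(line, remap=PAPER_REMAP):
    """Return the GAF line with any remapped paper IDs substituted."""
    if line.startswith("!"):
        return line  # header/comment line
    cols = line.split("\t")
    if len(cols) <= REFERENCE_COL:
        return line  # too short to contain a reference column
    refs = []
    for ref in cols[REFERENCE_COL].split("|"):
        # Split "PREFIX:ID" at the last colon; the prefix is kept unchanged.
        prefix, sep, paper = ref.rpartition(":")
        refs.append(prefix + sep + remap.get(paper, paper))
    cols[REFERENCE_COL] = "|".join(refs)
    return "\t".join(cols)
```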


Back to Gene Ontology