UniProt-GOA syntax checking

From WormBaseWiki

Revision as of 13:33, 9 August 2012

Errors to Fix

With/From Column

  1. IC annotations need a GO term in With/From - DONE
  2. ISS annotations need database identifiers in With/From
    1. 917 ISS annotations in postgres as of 2012-07-23
    2. Some annotations use transgenes in the With/From column; this syntax should be okay (see jnk-1 and sir-2.1 for examples)
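
The two With/From rules above could be checked with something like the following minimal sketch. It assumes tab-delimited GAF 2.0 lines, where the evidence code is column 7 and With/From is column 8. The function name `check_with_from` and the generic `prefix:identifier` pattern are illustrative assumptions, not part of any existing pipeline, and transgene entries are not given special handling here.

```python
import re

# GO term IDs have the form GO: followed by seven digits.
GO_ID = re.compile(r"^GO:\d{7}$")
# Assumed generic shape of a database identifier: prefix:identifier.
DB_ID = re.compile(r"^[A-Za-z][\w.-]*:\S+$")


def check_with_from(gaf_line):
    """Return an error message for a GAF line, or None if it passes."""
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) < 8:
        return "too few columns"
    evidence, with_from = cols[6], cols[7]
    # With/From may hold several values separated by '|' or ','.
    values = [v for v in re.split(r"[|,]", with_from) if v]
    if evidence == "IC":
        if not any(GO_ID.match(v) for v in values):
            return "IC annotation lacks a GO term in With/From"
    elif evidence == "ISS":
        if not values or not all(DB_ID.match(v) for v in values):
            return "ISS annotation lacks database identifiers in With/From"
    return None
```

Running this over a dump of the annotations would give a list of lines to fix by hand; it does not attempt to confirm that an identifier actually resolves.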

Phenotype2GO Pipeline

  1. Remove annotations mapping to ncRNA and pseudogenes
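
One way this filtering step might look is sketched below. It assumes GAF lines with the gene identifier in column 2 and a lookup table from gene ID to biotype. The name `filter_annotations`, the `biotype_by_gene` table, and the biotype labels are hypothetical; the real pipeline would take its biotypes from wherever WormBase stores them.

```python
# Assumed biotype labels; the actual labels depend on the source of the lookup.
EXCLUDED_BIOTYPES = {"ncRNA", "pseudogene"}


def filter_annotations(gaf_lines, biotype_by_gene):
    """Drop annotation lines whose gene maps to an excluded biotype."""
    kept = []
    for line in gaf_lines:
        # Header and comment lines start with '!' and are passed through.
        if line.startswith("!"):
            kept.append(line)
            continue
        cols = line.split("\t")
        gene_id = cols[1] if len(cols) > 1 else ""
        if biotype_by_gene.get(gene_id) in EXCLUDED_BIOTYPES:
            continue
        kept.append(line)
    return kept
```

Genes missing from the lookup are kept rather than dropped, so an incomplete table does not silently remove annotations.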

IEA Pipelines

  1. UniProtKB will perform InterPro2GO mappings in-house
  2. TMHMM-derived annotations need a resolvable accession in With/From column
    1. Is there an accession for TMHMM?
    2. If not, what could be used in place of this pipeline if we stand to lose annotations?

Annotations to ncRNAs

  1. Need a specific mapping file for those genes
  • WBGene00003262 mir-34

Annotations to Uncloned Genes

  1. Need a specific mapping file for those genes
  2. Genes affected (partial list):
  • cad-1
  • exc-1
  • exc-2
  • exc-3
  • exc-6
  • exc-8
  • hid-2
  • hid-4
  • ric-1
  • seu-2
  • seu-3
  • sog-1
  • sog-2
  • sog-3
  • sog-4
  • sog-5
  • sog-6
  • sog-10
  • szy-1
  • szy-2
  • szy-3
  • szy-4
  • szy-5
  • szy-6
  • szy-7
  • szy-8
  • szy-9
  • szy-10
  • szy-11
  • szy-12
  • szy-13
  • szy-14
  • szy-15
  • szy-16
  • szy-17
  • szy-18
  • szy-19
  • unc-65

gp2protein File

  1. Need a version updated as often as possible - UniProtKB can upload nightly

Unsupported/Missing Reference

  • Including DOIs instead would be fine, if available
  1. WBPaper00004663 - added DOI in paper editor; the DOI will still need to be dumped into the GAF


  • Some GO references still refer to meeting abstracts
  1. WBPaper00011144 ced-11 meeting abstract. Action - deleted annotation. DONE.
  2. WBPaper00022068 ced-11 meeting abstract. Action - deleted annotation. DONE. Added an IGI annotation with ced-3 from WBPaper00003815.
  3. WBPaper00011088 ces-1 meeting abstract. Action - deleted annotation. DONE. Added an IC annotation with GO:0043565 in WITH/FROM.
  4. WBPaper00018934 ceh-37 meeting abstract. R.


  • Annotations from P2GO pipeline that reference a paper's erratum, not the original paper
  1. WBPaper00006304 should be WBPaper00005637


  • Annotations from P2GO pipeline that reference a duplicate paper object for which there is no bibliographic information in WormBase
  1. WBPaper00005149 should be merged into WBPaper00005123
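
Until the paper objects themselves are corrected, the two paper substitutions listed above (erratum to original paper, duplicate to merged paper) could be applied to the reference column of the GAF with a sketch like the following. It assumes tab-delimited GAF lines with references in column 6. The function name `remap_references` is hypothetical, and only the two paper pairs named above are included.

```python
import re

# Paper ID substitutions taken from the two items above.
PAPER_REPLACEMENTS = {
    "WBPaper00006304": "WBPaper00005637",  # erratum -> original paper
    "WBPaper00005149": "WBPaper00005123",  # duplicate -> merged paper
}
WBPAPER = re.compile(r"WBPaper\d{8}")


def remap_references(gaf_line):
    """Rewrite known-bad WBPaper IDs in the reference column of one GAF line."""
    line = gaf_line.rstrip("\n")
    cols = line.split("\t")
    # Leave header lines and malformed lines untouched.
    if line.startswith("!") or len(cols) < 6:
        return line
    cols[5] = WBPAPER.sub(
        lambda m: PAPER_REPLACEMENTS.get(m.group(0), m.group(0)), cols[5]
    )
    return "\t".join(cols)
```

Only the paper ID itself is replaced, so whatever database prefix precedes it in the reference column is preserved.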

