UniProt-GOA syntax checking

From WormBaseWiki

Errors to Fix

With/From Column

  • IC annotations need GO term in With/From - DONE
  • ISS annotations need database identifiers in With/From
  1. 917 ISS annotations in postgres as of 2012-07-23 - ~400 are legacy annotations without With/From entry
  2. Action - update as many as possible, but may need to move forward regardless; note that annotations would not be dumped/displayed.
  3. Action - retrieve from database all genes with an ISS and an experimental evidence code annotation to the same term
  • IMP annotations - some annotations use transgenes in the With/From column - this syntax should be okay, see jnk-1 and sir-2.1 for examples. Action - discuss with Rachael, Tony
  1. Review annotations to make sure they're still consistent with GO annotation practice
  2. Update transgene symbols to WB transgene identifiers - also true for fox-1 IGI
  • IEP annotation - WBVariation ID in With column for an IEP annotation. Removed.
  • ISS annotation - With string not recognized or valid. act-2 - updated identifier in With column.
  • ISS annotation - With string not recognized or valid. dpy-27 - removed annotation; no IDA annotation from another organism to support the ISS.
  • ISS annotation - With string not recognized or valid. eat-18. Removed annotation.
  • ISS annotation - With string not recognized or valid. egl-1. Removed annotation; other supporting experimental data.
  • ISS annotation - With string not recognized or valid. crn-3. R.
  • IGI annotation - With string not recognized or valid. fat-5 and fat-6 Updated annotations to correct SGD identifier.
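The two With/From rules listed at the top of this section (IC needs a GO term, ISS needs database identifiers) could be pre-checked before submission with something like the following. This is only a rough sketch, not the UniProt-GOA checker: the function names and the identifier pattern are made up for illustration, and it assumes a tab-delimited GAF line with the evidence code in column 7 and With/From in column 8.

```python
import re

# Assumed rules, taken only from the notes above: IC needs a GO term in
# With/From, ISS needs at least one database identifier (PREFIX:accession).
GO_ID = re.compile(r"^GO:\d{7}$")
DB_ID = re.compile(r"^[A-Za-z][\w.-]*:\S+$")  # hypothetical "looks like an ID" pattern


def split_with(with_from):
    """Split a With/From value on the '|' and ',' separators."""
    return [v for v in re.split(r"[|,]", with_from.strip()) if v]


def check_with_from(evidence, with_from):
    """Return an error string, or None if the value passes these checks."""
    values = split_with(with_from)
    if evidence == "IC":
        if not any(GO_ID.match(v) for v in values):
            return "IC annotation needs a GO term in With/From"
    elif evidence == "ISS":
        if not values:
            return "ISS annotation needs a database identifier in With/From"
        bad = [v for v in values if not DB_ID.match(v)]
        if bad:
            return "With string not recognized: " + ", ".join(bad)
    return None


def check_gaf_line(line):
    """Check one GAF line (evidence = column 7, With/From = column 8)."""
    cols = line.rstrip("\n").split("\t")
    if line.startswith("!") or len(cols) < 8:
        return None  # header/comment or short line: out of scope for this sketch
    return check_with_from(cols[6], cols[7])
```

Other evidence codes (IMP, IGI, IEP) are passed through unchecked here; their rules would need to be confirmed with UniProt-GOA before adding them.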

Phenotype2GO Pipeline

  1. Remove annotations mapping to ncRNAs (e.g. 21U RNAs) and check again for pseudogene exclusion
  2. Update syntax of WB Phenotypes in With/From column for Phenotype2GO-based IMP annotations

IEA Pipelines

  • UniProtKB will perform InterPro2GO mappings in-house
  • TMHMM-derived annotations need a resolvable accession in With/From column
  1. Is there an accession for TMHMM?
  2. If not, what could be used in place of this pipeline if we stand to lose annotations?
  3. Remove this mapping pipeline from WB. Keep TMHMM results in another database tag? Motif?

Annotations to ncRNAs

  • Need a specific mapping file for those genes
  1. Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
  2. CVS update -d
  • Annotated genes in file:
  1. mir-34

Annotations to Uncloned Genes

  • Need a specific mapping file for those genes
  1. Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
  • Genes affected (partial list):
  • abc-1
  • adp-1
  • cad-1
  • cat-6
  • cib-1
  • cup-1
  • cup-3
  • cup-8
  • cup-9
  • cup-11
  • cyk-2
  • exc-1
  • exc-2
  • exc-3
  • exc-6
  • exc-8
  • hid-2
  • hid-4
  • ric-1
  • seu-2
  • seu-3
  • sog-1
  • sog-2
  • sog-3
  • sog-4
  • sog-5
  • sog-6
  • sog-10
  • szy-1
  • szy-2
  • szy-3
  • szy-5
  • szy-6
  • szy-7
  • szy-8
  • szy-9
  • szy-10
  • szy-11
  • szy-12
  • szy-13
  • szy-14
  • szy-15
  • szy-16
  • szy-17
  • szy-18
  • szy-19
  • unc-65

Annotations to Dead Genes

  • Do the dumping scripts filter out annotations made to now dead/invalid genes?
  • Corrections/Updates made:
  1. WBGene00004722 annotations merged into WBGene00011195/sao-1
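If the dumping scripts turn out not to filter dead genes, one possible approach is sketched here. It is not the existing dump code: `live_genes` and `merged_into` are hypothetical inputs that would have to be built from the WormBase gene records. Annotations to a merged gene are moved to its live target, and annotations to a dead gene with no merge target are set aside for review.

```python
def resolve_gene(gene_id, live_genes, merged_into):
    """Follow merge records until a live gene is reached.

    Returns None if the gene is dead and has no live merge target.
    """
    seen = set()
    while gene_id not in live_genes:
        if gene_id in seen or gene_id not in merged_into:
            return None  # dead end or circular merge record
        seen.add(gene_id)
        gene_id = merged_into[gene_id]
    return gene_id


def filter_annotations(annotations, live_genes, merged_into):
    """Split (gene_id, go_id) pairs into kept (remapped) and dropped lists."""
    kept, dropped = [], []
    for gene_id, go_id in annotations:
        target = resolve_gene(gene_id, live_genes, merged_into)
        if target is None:
            dropped.append((gene_id, go_id))
        else:
            kept.append((target, go_id))
    return kept, dropped


# Example using the merge recorded above; the GO term and the second
# gene ID are placeholders for illustration only.
live = {"WBGene00011195"}
merges = {"WBGene00004722": "WBGene00011195"}
kept, dropped = filter_annotations(
    [("WBGene00004722", "GO:0043565"), ("WBGene99999999", "GO:0043565")],
    live,
    merges,
)
```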

gp2protein File

  • Need a version updated as often as possible to keep IDs as closely in sync as possible.
  • UniProtKB can upload file nightly.
  1. Action - need to develop a pipeline for more frequent updates of WB gp2protein file, as well as gp2ncRNA and gp2unlocalized
  • Updates for now:
  1. mig-21
  2. daf-16 - getting error message but IDs seem okay in gp2protein file
  3. lpr-7
  4. pat-12 - many new isoforms
  5. mog-2
  6. let-765
  7. fmi-1

Unsupported/Missing Reference

  • Published papers without PMIDs - Including DOIs instead would be fine, if available.
  1. WBPaper00004663 - added DOI in paper editor, will need to dump DOI in WB GAF
  2. WBPaper00003666 - rab-11.1 Missing PMID. Action - added PMID to paper in WB paper editor.
  3. WBPaper00005125 - sdc-2 No PMID, but DOI. Annotation to secondary ID? R.
  4. WBPaper00000823 - sod-1 No PMID, but DOI. Dump DOI or add annotations to protein2go.
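For the dump, the PMID-else-DOI fallback described above might look like this. It is a sketch under assumptions: the function name is hypothetical, the `WB_REF:` and `DOI:` prefixes and the `|` separator are assumed conventions for the reference column, and the example PMID and DOI values are placeholders, not real identifiers for these papers.

```python
def gaf_reference(paper_id, pmid=None, doi=None):
    """Build a reference value for one paper.

    Prefers the PMID, falls back to the DOI, and returns None when
    neither exists so the annotation can be flagged for a curator.
    """
    refs = ["WB_REF:" + paper_id]  # assumed prefix for WBPaper IDs
    if pmid:
        refs.append("PMID:" + str(pmid))
    elif doi:
        refs.append("DOI:" + doi)
    else:
        return None
    return "|".join(refs)


# Placeholder identifiers, for illustration only.
with_pmid = gaf_reference("WBPaper00003666", pmid=12345)
with_doi = gaf_reference("WBPaper00004663", doi="10.1000/example")
unresolved = gaf_reference("WBPaper00011144")
```

Returning None rather than a bare WBPaper reference keeps papers with no PMID or DOI (e.g. meeting abstracts) out of the dump until a curator has looked at them.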


  • Some GO references still refer to meeting abstracts
  1. WBPaper00011144 ced-11 meeting abstract. Action - deleted annotation.
  2. WBPaper00022068 ced-11 meeting abstract. Action - deleted annotation. Added an IGI annotation with ced-3 from WBPaper00003815.
  3. WBPaper00011088 ces-1 meeting abstract. Action - deleted annotation. Added an IC annotation with GO:0043565 in WITH/FROM.
  4. WBPaper00016619 dpr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
  5. WBPaper00018550 dpr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
    1. Note - deleted all associated GO annotations for dpr-1, as evidence was based on unpublished results cited in Discussion of a paper.
  6. WBPaper00011270 flp-3 meeting abstract. Action - deleted annotation and updated with annotation from published paper.
  7. WBPaper00019004 gcy-36 meeting abstract. Action - deleted annotation and updated with annotation from published paper.
  8. WBPaper00015370 ggr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
  9. WBPaper00015370 ggr-2 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
  10. WBPaper00010338 glc-4 meeting abstract. Action - deleted annotation. Updates possible with WBPaper00034763 and WBPaper00041267.
  11. WBPaper00011328 gtl-1 meeting abstract. Action - deleted annotation. Added IGI and IMP annotations from WBPaper00031549.
  12. WBPaper00019247 hda-4 meeting abstract. Action - deleted annotation. Updated annotations to WBPaper00028910.
  13. WBPaper00018626 him-1 meeting abstract. Action - deleted annotation. Annotations from papers already there.
  14. WBPaper00012310 iff-1 meeting abstract. Action - deleted annotation. Updates possible from WBPaper00006466?
  15. WBPaper00018980 jkk-1 meeting abstract. Action - deleted annotation. Possible literature updates?
  16. WBPaper00017664 kqt-2 meeting abstract. Action - deleted annotation. Updated one to WBPaper00031113 and other to WBPaper00025059.
  17. WBPaper00015206 lam-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
  18. WBPaper00010448 lat-1 meeting abstract. Action - deleted annotation. Updated from WBPaper00035406. Needs more work.
  19. WBPaper00011877 lec-11 meeting abstract. Action - deleted annotation. Could possibly be updated from literature?
  20. WBPaper00018850 lin-25 meeting abstract. Action - deleted annotation. Updated annotation already there.
  21. WBPaper00019638 klf-3 meeting abstract. Action - deleted annotation. Definitely needs updates from more recent papers.
  22. WBPaper00022992 klf-3 meeting abstract. Action - deleted annotation. Definitely needs updates from more recent papers.
  23. WBPaper00018062 pkc-1 meeting abstract. Action - retained annotations, but updated to published reference.
  24. WBPaper00019015 pkc-1 meeting abstract. Action - retained annotation, but updated to published reference.
  25. WBPaper00018615 rab-5 meeting abstract. Action - retained annotation, but updated to published reference.
  26. WBPaper00015788 rab-5 meeting abstract. Action - deleted annotation. Annotations from papers already there.
  27. WBPaper00018148 ram-2 meeting abstract. Action - updated P annotations to published reference; deleted F annotation - no reference.
  28. WBPaper00019612 ram-2 meeting abstract. Action - deleted annotations. No published information to support meeting abstract.
  29. WBPaper00011383 rap-1 meeting abstract. Action - retained annotation, but updated to published reference and added WITH for ISS.
  30. WBPaper00019034 rha-1 meeting abstract. Action - deleted C and one P annotations. No published information to support meeting abstract for these. Updated one P annotation to published reference.
  31. WBPaper00018735 rha-1 meeting abstract. Action - retained annotation, but updated to published reference.
  32. WBPaper00018011 syd-1 meeting abstract. Action - deleted annotation. Could possibly be updated from literature?
  33. WBPaper00012588 unc-104 meeting abstract. Action - deleted annotation. Other high quality function annotations exist.
  34. WBPaper00018934 ceh-37 meeting abstract. R.
  35. WBPaper00011712 cpr-1 meeting abstract. R.
  36. WBPaper00022817 cpr-1 meeting abstract. R.
  37. WBPaper00011485 crh-1 meeting abstract. R.
  38. WBPaper00017392 crh-1 meeting abstract. R.
  39. WBPaper00018784 fat-7 meeting abstract. R.
  40. WBPaper00019580 fkh-6 meeting abstract. R.
  41. WBPaper00022748 flr-4 meeting abstract. R.
  42. WBPaper00011648 gbh-2 meeting abstract. R.
  43. WBPaper00018044 gei-4 meeting abstract. R.
  44. WBPaper00018124 gei-4 meeting abstract. R.
  45. WBPaper00018893 him-17 meeting abstract. R.
  46. WBPaper00022917 ins-14 meeting abstract. R.
  47. WBPaper00018355 inx-12 meeting abstract. R. Also check to replace missing term from now secondary GO ids?
  48. WBPaper00019572 itx-1 meeting abstract. R.
  49. WBPaper00018039 klc-2 meeting abstract. R.
  50. WBPaper00010853 mir-35, mir-36, mir-37 meeting abstract. R.
  51. WBPaper00019534 mir-84 meeting abstract. R.
  52. WBPaper00018909 ptp-3 meeting abstract. R.
  53. WBPaper00018756 scc-3 meeting abstract. R.
  54. WBPaper00016406 sup-17 meeting abstract. R.
  55. WBPaper00017138 unc-58 meeting abstract. R.
  56. WBPaper00018742 wve-1 meeting abstract. R.
  57. WBPaper00011477 zag-1 meeting abstract. R.
  58. WBPaper00019643 zig-6 meeting abstract. R.


  • Annotations from P2GO pipeline or manual pipeline that reference a paper's erratum, not the original paper
  1. WBPaper00006304 updated to WBPaper00005637.
  2. WBPaper00004695 updated to WBPaper00004608. unc-78


  • Annotations from P2GO pipeline that reference a duplicate paper object for which there is no bibliographic information in WormBase
  1. WBPaper00005149 should be merged into WBPaper00005123


  • Annotation to incorrect paper - typo in paper ID.
  1. WBPaper00005379 updated to be WBPaper00005370. sek-1
  2. WBPaper00030350 updated to be WBPaper00031350. tip-1
  3. WBPaper00006538 updated to be WBPaper00006358. unc-27
  4. WBPaper00030369 updated to be WBPaper00031369. unc-41

Note about typos - these papers were flagged only because they don't have a corresponding PMID - i.e. they aren't indexed by MEDLINE or they are meeting abstracts. A typo that happens to point to a paper with a PMID would not be flagged, so how would we find it? One idea: use Textpresso to search each annotation's reference for the annotated gene name, and flag every paper in which the gene name is not found.
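A rough sketch of the gene-name check proposed in this note. It does not call Textpresso: it assumes the paper full text has already been retrieved into a dictionary (`paper_text`, hypothetical), and the function names and example texts are invented for illustration.

```python
import re


def mentions_gene(text, gene_name):
    """True if gene_name occurs as a whole token (case-insensitive).

    The lookarounds stop 'sek-1' from matching inside 'sek-10' and
    'sir-2' from matching inside 'sir-2.1'.
    """
    pattern = r"(?<![\w.-])" + re.escape(gene_name) + r"(?![\w-]|\.\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def flag_suspect_references(annotations, paper_text):
    """Return (paper_id, gene_name) pairs where the paper never names the gene."""
    flagged = []
    for paper_id, gene_name in annotations:
        text = paper_text.get(paper_id)
        if text is None:
            continue  # no text available: cannot check this reference
        if not mentions_gene(text, gene_name):
            flagged.append((paper_id, gene_name))
    return flagged


# Invented texts, for illustration only.
texts = {
    "WBPaper00005379": "A study of an unrelated gene.",
    "WBPaper00005370": "SEK-1 acts upstream of a MAP kinase.",
}
pairs = [("WBPaper00005379", "sek-1"), ("WBPaper00005370", "sek-1")]
suspects = flag_suspect_references(pairs, texts)
```

Matching is case-insensitive so that protein-style names (SEK-1) count as a mention. Papers that use only an older gene name or a sequence name would be flagged too, so the output is a list for curator review, not a list of confirmed typos.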

