UniProt-GOA syntax checking

From WormBaseWiki

Back to Gene Ontology

Error Output After Checking

  • Output from the GOA syntax checking script on each of the four curator files will be used to determine which annotations are removed from postgres and which will stay (e.g., miRNA annotations).

Errors to Fix

With/From Column

  • IC annotations need GO term in With/From - DONE
  • ISS annotations need database identifiers in With/From
  1. 917 ISS annotations in postgres as of 2012-07-23 - ~400 are legacy annotations without With/From entry
  2. Action - update as many as possible, but may need to move forward regardless; note that annotations would not be dumped/displayed.
  3. Action - retrieve from database all genes with an ISS and an experimental evidence code annotation to the same term
  • IMP annotations - some annotations use transgenes in the With/From column - this syntax should be okay, see jnk-1 and sir-2.1 for examples. Action - discuss with Rachael, Tony
  1. Review annotations to make sure they're still consistent with GO annotation practice
  2. Update transgene symbols to WB transgene identifiers - also true for fox-1 IGI
  3. Fixed sir-2.1 2012-10-05
  4. Fixed jnk-1 2012-10-05
  • IEP annotation - WBVariation ID in With column for an IEP annotation. Removed.
  • ISS annotation - With string not recognized or valid. act-2 - updated identifier in With column.
  • ISS annotation - With string not recognized or valid. dpy-27 - removed annotation; no IDA from other organism to ISS.
  • ISS annotation - With string not recognized or valid. eat-18. Removed annotation.
  • ISS annotation - With string not recognized or valid. egl-1. Removed annotation; other supporting experimental data.
  • ISS annotation - With string not recognized or valid. crn-3. Updated annotation to correct SGD identifier.
  • IGI annotation - With string not recognized or valid. fat-5 and fat-6 Updated annotations to correct SGD identifier.
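
The retrieval described in the ISS action item above (all genes with both an ISS and an experimental evidence code annotation to the same term) could be sketched roughly as follows. This is only an illustration: the function name, the (gene, GO ID, evidence code) tuple layout, and the example rows are assumptions and do not reflect the actual postgres schema.

```python
from collections import defaultdict

# The six classic GO experimental evidence codes.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}

def iss_with_experimental_support(annotations):
    """Return sorted (gene, go_id) pairs that carry both an ISS annotation
    and at least one experimental-evidence annotation to the same term."""
    codes_by_pair = defaultdict(set)
    for gene, go_id, evidence in annotations:
        codes_by_pair[(gene, go_id)].add(evidence)
    return sorted(
        pair for pair, codes in codes_by_pair.items()
        if "ISS" in codes and codes & EXPERIMENTAL_CODES
    )

# Made-up rows for illustration only (hypothetical genes and GO IDs).
example_rows = [
    ("gene-a", "GO:0000001", "ISS"),
    ("gene-a", "GO:0000001", "IDA"),
    ("gene-b", "GO:0000002", "ISS"),
    ("gene-c", "GO:0000003", "IMP"),
]
print(iss_with_experimental_support(example_rows))
```

Pairs returned by such a query are candidates where a legacy ISS annotation without a With/From entry may be redundant with existing experimental data.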

Phenotype2GO Pipeline

  1. Remove annotations mapping to ncRNAs (e.g. 21U RNAs) and check again for pseudogene exclusion
  2. Update syntax of WB Phenotypes in With/From column for Phenotype2GO-based IMP annotations
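
As a rough sketch of the first step above, annotations could be partitioned by the biotype of the annotated gene. The biotype labels, the dictionary layout, and the function name here are assumptions for illustration, not the Phenotype2GO pipeline's actual data model.

```python
# Assumed biotype labels; the real pipeline may use different names.
EXCLUDED_BIOTYPES = {"ncRNA", "piRNA", "pseudogene"}

def filter_phenotype2go(annotations, biotype_by_gene):
    """Split annotations into (kept, removed) based on gene biotype.
    Genes with no known biotype are kept so they can be reviewed separately."""
    kept, removed = [], []
    for ann in annotations:
        biotype = biotype_by_gene.get(ann["gene"])
        (removed if biotype in EXCLUDED_BIOTYPES else kept).append(ann)
    return kept, removed

# Hypothetical genes and GO IDs, for illustration only.
example_annotations = [
    {"gene": "gene-a", "go_id": "GO:0000001"},
    {"gene": "gene-b", "go_id": "GO:0000002"},
    {"gene": "gene-c", "go_id": "GO:0000003"},
]
example_biotypes = {
    "gene-a": "protein_coding",
    "gene-b": "piRNA",
    "gene-c": "pseudogene",
}
kept, removed = filter_phenotype2go(example_annotations, example_biotypes)
print([a["gene"] for a in kept], [a["gene"] for a in removed])
```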

IEA Pipelines

  • UniProtKB will perform InterPro2GO mappings in-house
  • TMHMM-derived annotations need a resolvable accession in With/From column
  1. Is there an accession for TMHMM?
  2. If not, what could be used in place of this pipeline if we stand to lose annotations?
  3. Remove this mapping pipeline from WB. Keep TMHMM results in another database tag? Motif?

Annotations to ncRNAs

  • Need a specific mapping file for those genes
  1. Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
  2. CVS update -d
  • Annotated genes in file:
  1. mir-34
  2. mir-61
  3. mir-84
  4. mir-241
  5. lin-4
  6. lin-58
  7. let-7
  8. yrn-1

Annotations to Uncloned Genes

  • Need a specific mapping file for those genes
  1. Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
  • Genes affected (partial list):
  • abc-1
  • adp-1
  • cad-1
  • cat-6
  • cib-1
  • clk-3
  • cup-1
  • cup-3
  • cup-8
  • cup-9
  • cup-11
  • cyk-2
  • eat-1
  • eat-10
  • eos-1
  • eos-2
  • erf-1
  • erf-2
  • exc-1
  • exc-2
  • exc-3
  • exc-6
  • exc-8
  • hid-2
  • hid-4
  • let-760
  • pha-3
  • pre-1
  • pre-7
  • pre-33
  • ric-1
  • seu-2
  • seu-3
  • sex-2
  • sog-1
  • sog-2
  • sog-3
  • sog-4
  • sog-5
  • sog-6
  • sog-10
  • spe-3
  • spe-7
  • spe-13
  • szy-1
  • szy-2
  • szy-3
  • szy-5
  • szy-6
  • szy-7
  • szy-8
  • szy-9
  • szy-10
  • szy-11
  • szy-12
  • szy-13
  • szy-14
  • szy-15
  • szy-16
  • szy-17
  • szy-18
  • szy-19
  • unc-65
  • unc-74
  • unc-109

Annotations to Dead Genes

  • Do the dumping scripts filter out annotations made to now dead/invalid genes?
  • Corrections/Updates made:
  1. WBGene00004722 annotations merged into WBGene00011195/sao-1
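
One way a dumping script might handle dead genes is to remap annotations on merged genes to their successor and drop the rest. The sketch below assumes a simple merged-into lookup and a set of live gene IDs; the function name, the input shapes, and the GO IDs are placeholders, and only the WBGene00004722 to WBGene00011195 merge comes from the list above.

```python
def remap_dead_genes(annotations, merged_into, live_genes):
    """Remap annotations from merged genes to their successor.
    Returns (kept, dropped); annotations whose gene is neither live
    nor merged into a live gene are dropped. Merge chains are not followed."""
    kept, dropped = [], []
    for gene_id, go_id in annotations:
        target = merged_into.get(gene_id, gene_id)
        if target in live_genes:
            kept.append((target, go_id))
        else:
            dropped.append((gene_id, go_id))
    return kept, dropped

# GO IDs and WBGene99999999 are placeholders for illustration.
merged_into = {"WBGene00004722": "WBGene00011195"}
live_genes = {"WBGene00011195"}
example_annotations = [
    ("WBGene00004722", "GO:0000001"),
    ("WBGene00011195", "GO:0000002"),
    ("WBGene99999999", "GO:0000003"),
]
kept, dropped = remap_dead_genes(example_annotations, merged_into, live_genes)
print(kept)
print(dropped)
```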

gp2protein File

  • Need a version updated as often as possible to keep IDs as closely in sync as possible.
  • UniProtKB can upload file nightly.
  1. Action - need to develop a pipeline for more frequent updates of WB gp2protein file, as well as gp2ncRNA and gp2unlocalized
  • Updates for now:
  1. mig-21
  2. daf-16 - getting error message but IDs seem okay in gp2protein file
  3. lpr-7
  4. pat-12 - many new isoforms
  5. mog-2
  6. let-765
  7. fmi-1
  8. dex-1 = D1044.2

Unsupported/Missing Reference

  • Published papers without PMIDs - including DOIs instead would be fine, if available.
  1. WBPaper00004663 - added doi in paper editor, will need to dump doi in WB GAF
  2. WBPaper00003666 - rab-11.1 Missing PMID. Action - added PMID to paper in WB paper editor.
  3. WBPaper00005125 - sdc-2 No PMID, but doi. Annotation to secondary ID? R.
  4. WBPaper00000823 - sod-1 No PMID, but doi. Dump doi or add annotations to protein2go.
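
The PMID-or-DOI fallback discussed above might look something like this when building the reference column of the GAF. This is a sketch under assumptions: the function name is hypothetical, the `WB_REF:`, `PMID:` and `DOI:` prefixes and pipe separator are assumed to be what the dump should emit, and the identifiers in the example are placeholders.

```python
def reference_field(paper_id, pmid=None, doi=None):
    """Build a pipe-separated reference string, preferring PMID over DOI.
    Returns None when neither identifier is available, so the caller
    can flag the annotation instead of dumping it."""
    refs = [f"WB_REF:{paper_id}"]
    if pmid:
        refs.append(f"PMID:{pmid}")
    elif doi:
        refs.append(f"DOI:{doi}")
    else:
        return None
    return "|".join(refs)

# Placeholder identifiers, for illustration only.
print(reference_field("WBPaper00000000", pmid="1234567"))
print(reference_field("WBPaper00000000", doi="10.1000/example"))
print(reference_field("WBPaper00000000"))
```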

  • Some GO references still refer to meeting abstracts
  1. WBPaper00011144 ced-11 meeting abstract. Action - deleted annotation.
  2. WBPaper00022068 ced-11 meeting abstract. Action - deleted annotation. Added an IGI annotation with ced-3 from WBPaper00003815.
  3. WBPaper00011088 ces-1 meeting abstract. Action - deleted annotation. Added an IC annotation with GO:0043565 in WITH/FROM.
  4. WBPaper00016619 dpr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
  5. WBPaper00018550 dpr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
    1. Note - deleted all associated GO annotations for dpr-1, as evidence was based on unpublished results cited in Discussion of a paper.
  6. WBPaper00011270 flp-3 meeting abstract. Action - deleted annotation and updated with annotation from published paper.
  7. WBPaper00019004 gcy-36 meeting abstract. Action - deleted annotation and updated with annotation from published paper.
  8. WBPaper00015370 ggr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
  9. WBPaper00015370 ggr-2 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
  10. WBPaper00010338 glc-4 meeting abstract. Action - deleted annotation. Updates possible with WBPaper00034763 and WBPaper00041267.
  11. WBPaper00011328 gtl-1 meeting abstract. Action - deleted annotation. Added IGI and IMP annotations from WBPaper00031549.
  12. WBPaper00019247 hda-4 meeting abstract. Action - deleted annotation. Updated annotations to WBPaper00028910.
  13. WBPaper00018626 him-1 meeting abstract. Action - deleted annotation. Annotations from papers already there.
  14. WBPaper00012310 iff-1 meeting abstract. Action - deleted annotation. Updates possible from WBPaper00006466?
  15. WBPaper00018980 jkk-1 meeting abstract. Action - deleted annotation. Possible literature updates?
  16. WBPaper00017664 kqt-2 meeting abstract. Action - deleted annotation. Updated one to WBPaper00031113 and other to WBPaper00025059.
  17. WBPaper00015206 lam-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
  18. WBPaper00010448 lat-1 meeting abstract. Action - deleted annotation. Updated from WBPaper00035406. Needs more work.
  19. WBPaper00011877 lec-11 meeting abstract. Action - deleted annotation. Could possibly be updated from literature?
  20. WBPaper00018850 lin-25 meeting abstract. Action - deleted annotation. Updated annotation already there.
  21. WBPaper00019638 klf-3 meeting abstract. Action - deleted annotation. Definitely needs updates from more recent papers.
  22. WBPaper00022992 klf-3 meeting abstract. Action - deleted annotation. Definitely needs updates from more recent papers.
  23. WBPaper00018062 pkc-1 meeting abstract. Action - retained annotations, but updated to published reference.
  24. WBPaper00019015 pkc-1 meeting abstract. Action - retained annotation, but updated to published reference.
  25. WBPaper00018615 rab-5 meeting abstract. Action - retained annotation, but updated to published reference.
  26. WBPaper00015788 rab-5 meeting abstract. Action - deleted annotation. Annotations from papers already there.
  27. WBPaper00018148 ram-2 meeting abstract. Action - updated P annotations to published reference; deleted F annotation - no reference.
  28. WBPaper00019612 ram-2 meeting abstract. Action - deleted annotations. No published information to support meeting abstract.
  29. WBPaper00011383 rap-1 meeting abstract. Action - retained annotation, but updated to published reference and added WITH for ISS.
  30. WBPaper00019034 rha-1 meeting abstract. Action - deleted C and one P annotations. No published information to support meeting abstract for these. Updated one P annotation to published reference.
  31. WBPaper00018735 rha-1 meeting abstract. Action - retained annotation, but updated to published reference.
  32. WBPaper00018011 syd-1 meeting abstract. Action - deleted annotation. Could possibly be updated from literature?
  33. WBPaper00012588 unc-104 meeting abstract. Action - deleted annotation. Other high quality function annotations exist.
  34. WBPaper00018934 ceh-37 meeting abstract. R. Action - retained annotation, updated to publication.
  35. WBPaper00011712 cpr-1 meeting abstract. R. Action - deleted annotation, lack of published evidence.
  36. WBPaper00022817 cpr-1 meeting abstract. R. Action - deleted annotation, lack of published evidence.
  37. WBPaper00011485 crh-1 meeting abstract. R. Action - deleted annotation, lack of published evidence.
  38. WBPaper00017392 crh-1 meeting abstract. R. Action - deleted annotation, lack of published evidence.

Note on crh-1 - annotations can be made from publications; this gene still needs to be annotated.

  1. WBPaper00018784 fat-7 meeting abstract. R. Action - deleted annotations (high-level terms); high quality annotations can be made from publications.
  2. WBPaper00019580 fkh-6 meeting abstract. R. Action - deleted 1 P term, added annotation from published reference.
  3. WBPaper00022748 flr-4 meeting abstract. R. Action - deleted high level annotations, lack of published evidence. Possible ISS, will wait for SOP for ISS to be determined.
  4. WBPaper00011648 gbh-2 meeting abstract. R. Action - deleted 4 (2C, 1P, 1F) annotations due to lack of published evidence.
  5. WBPaper00018044 gei-4 meeting abstract. R. Action - deleted annotations due to lack of published evidence. Added 4 new annotations from published paper.
  6. WBPaper00018124 gei-4 meeting abstract. R. Action - deleted annotations due to lack of published evidence.
  7. WBPaper00018893 him-17 meeting abstract. R. Action - updated annotations to published reference.
  8. WBPaper00022917 ins-14 meeting abstract. R. Action - deleted annotations due to lack of published evidence.
  9. WBPaper00018355 inx-12 meeting abstract. R. Action - deleted annotations with no GO term, updated CC annotation to published reference.
  10. WBPaper00019572 itx-1 meeting abstract. R. Action - deleted annotations due to lack of published evidence.
  11. WBPaper00018039 klc-2 meeting abstract. R. Action - deleted annotations due to lack of published evidence; added new granular annotations from publication.
  12. WBPaper00010853 mir-35 meeting abstract. R. Action - deleted annotation due to lack of published evidence; TAS annotations could possibly be updated to experimental evidence codes.
  13. WBPaper00010853 mir-36 meeting abstract. R. Action - deleted annotation
  14. WBPaper00010853 mir-37 meeting abstract. R. Action - deleted annotation
  15. WBPaper00019534 mir-84 meeting abstract. R. Action - Deleted 1 P annotation, updated with several annotations from publication.
  16. WBPaper00018909 ptp-3 meeting abstract. R. Action - Deleted 1 BP annotation; updated with 3 BP annotations from recent paper.
  17. WBPaper00018756 scc-3 meeting abstract. R. Action - Deleted annotation, updates possible.
  18. WBPaper00016406 sup-17 meeting abstract. R. Action - Deleted annotation, updated with annotations from publication.
  19. WBPaper00017138 unc-58 meeting abstract. R. Action - Deleted annotation, updated with annotations from publication.
  20. WBPaper00018742 wve-1 meeting abstract. R. Action - Updated annotations to published paper, added several new annotations.
  21. WBPaper00011477 zag-1 meeting abstract. R. Action - Deleted annotation due to lack of published evidence.
  22. WBPaper00019643 zig-6 meeting abstract. R. Action - Deleted annotation due to lack of published evidence.

  • Annotations from P2GO pipeline or manual pipeline that reference a paper's erratum, not the original paper
  1. WBPaper00006304 updated to WBPaper00005637.
  2. WBPaper00004695 updated to WBPaper00004608. unc-78

  • Annotations from P2GO pipeline and CC annotation that reference a duplicate paper object for which there is no bibliographic information in WormBase
  1. WBPaper00005149 merged into WBPaper00005123

  • Annotation to incorrect paper - typo in paper ID.
  1. WBPaper00005379 updated to be WBPaper00005370. sek-1
  2. WBPaper00030350 updated to be WBPaper00031350. tip-1
  3. WBPaper00006538 updated to be WBPaper00006358. unc-27
  4. WBPaper00030369 updated to be WBPaper00031369. unc-41

Note about typos - these papers are getting flagged because they don't have a corresponding PMID - i.e. they aren't indexed by MEDLINE or they are meeting abstracts. If there were paper typos that didn't lead to flagging, how would we find them? One idea: search associated references with gene names via Textpresso and flag all those papers where the annotated gene names are not found in the paper.
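
The idea in the note above could be prototyped along these lines. The sketch uses a plain dictionary of paper text as a stand-in for a Textpresso search (it does not call Textpresso), and the function name, paper keys, and example text are all hypothetical.

```python
import re

def flag_unmentioned_genes(annotations, paper_text):
    """Return (gene, paper_id) pairs where the gene name does not occur
    in the paper's text. Papers with no available text are flagged too."""
    flagged = []
    for gene, paper_id in annotations:
        text = paper_text.get(paper_id, "")
        # Lookarounds stop "unc-27" from matching inside "unc-270".
        pattern = r"(?<![\w-])" + re.escape(gene) + r"(?![\w-])"
        if not re.search(pattern, text, flags=re.IGNORECASE):
            flagged.append((gene, paper_id))
    return flagged

# Made-up paper keys and text, for illustration only.
example_text = {
    "P1": "SEK-1 acts in this pathway.",
    "P2": "This paper studies unc-270 only.",
}
example_annotations = [("sek-1", "P1"), ("unc-27", "P2"), ("tip-1", "P3")]
print(flag_unmentioned_genes(example_annotations, example_text))
```

Matching is case-insensitive so that protein-style names (e.g. SEK-1) count as a mention; synonyms and sequence names would need an extra lookup that is not shown here.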

Errors from GOC gaf checking on Jenkins

Date that curator checked on Jenkins: 10/10/2012

  • GO_AR:0000016 IC annotations require a With/From GO ID Warning count: 1
  1. WBGene00000517 cki-2 GO:0005634 RK Action - deleted annotation due to lack of published evidence.
  • GO_AR:0000018 IPI annotations require a With/From entry Warning count: 6
  1. WBGene00000265 brd-1 GO:0031436 RK Action - added appropriate entry to With/From
  2. WBGene00000815 csn-3 GO:0008180 RK Action - added entry to With/From
  3. WBGene00001564 icl-1 GO:0009790 RK Action - deleted, high level and non-specific
  4. WBGene00004808 skr-2 GO:0019005 RK Action - added entry to With/From
  5. WBGene00004816 skr-10 GO:0019005 RK Action - added entry to With/From
  6. WBGene00004887 smn-1 GO:0043621 RK Action - added entry to With/From
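
The two rules reported above can be approximated locally before submission. The sketch below assumes tab-separated GAF lines with the evidence code in column 7 and With/From in column 8; it is not the GOC checker itself, the helper names are hypothetical, and the reference value in the example lines is a placeholder.

```python
import re

GO_ID = re.compile(r"GO:\d{7}")

def check_with_from(gaf_line):
    """Return the GO_AR rule IDs that a single GAF line appears to violate."""
    cols = gaf_line.rstrip("\n").split("\t")
    evidence, with_from = cols[6], cols[7]
    warnings = []
    # GO_AR:0000016: IC annotations require a With/From GO ID.
    if evidence == "IC" and not GO_ID.search(with_from):
        warnings.append("GO_AR:0000016")
    # GO_AR:0000018: IPI annotations require a With/From entry.
    if evidence == "IPI" and not with_from.strip():
        warnings.append("GO_AR:0000018")
    return warnings

def make_line(gene_id, symbol, go_id, evidence, with_from, aspect):
    """Build a minimal 9-column GAF-like line; the reference is a placeholder."""
    return "\t".join(["WB", gene_id, symbol, "", go_id,
                      "WB_REF:WBPaper00000000", evidence, with_from, aspect])

ic_line = make_line("WBGene00000517", "cki-2", "GO:0005634", "IC", "", "C")
ipi_line = make_line("WBGene00004808", "skr-2", "GO:0019005", "IPI", "", "C")
ok_line = make_line("WBGene00000517", "cki-2", "GO:0005634", "IC", "GO:0043565", "C")
for line in (ic_line, ipi_line, ok_line):
    print(check_with_from(line))
```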
