UniProt-GOA syntax checking
From WormBaseWiki
Errors to Fix
With/From Column
- IC annotations need GO term in With/From - DONE
- ISS annotations need database identifiers in With/From
- 917 ISS annotations in postgres as of 2012-07-23 - ~400 are legacy annotations without With/From entry
- Action - update as many as possible, but may need to move forward regardless; note that annotations still lacking a With/From entry would not be dumped/displayed.
- Action - retrieve from database all genes with an ISS and an experimental evidence code annotation to the same term
- IMP annotations - some annotations use transgenes in the With/From column - this syntax should be okay, see jnk-1 and sir-2.1 for examples. Action - discuss with Rachael, Tony
- Review annotations to make sure they're still consistent with GO annotation practice
- Update transgene symbols to WB transgene identifiers - also true for fox-1 IGI
- IEP annotation - WBVariation ID in With column for an IEP annotation. Removed.
- ISS annotation - With string not recognized or valid. act-2 - updated identifier in With column.
- ISS annotation - With string not recognized or valid. dpy-27 - removed annotation; no IDA from other organism to ISS.
- ISS annotation - With string not recognized or valid. eat-18. Removed annotation.
- ISS annotation - With string not recognized or valid. egl-1. Removed annotation; other supporting experimental data.
- ISS annotation - With string not recognized or valid. crn-3. Updated annotation to correct SGD identifier.
- IGI annotation - With string not recognized or valid. fat-5 and fat-6 Updated annotations to correct SGD identifier.
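The With/From rules above could be checked before submission with something like the following. This is a minimal sketch, not the checker UniProt-GOA runs: it assumes the annotations are available as GAF 2.x rows (tab-separated, evidence code in column 7, With/From in column 8), the `DB_ID` pattern is only a loose guess at the "DB:accession" shape, and all function names are hypothetical. The second function covers the action item of finding genes with both an ISS and an experimental annotation to the same term.

```python
import re

# GAF 2.x column positions (0-based): symbol, GO ID, evidence code, With/From.
COL_SYMBOL, COL_GO, COL_EVIDENCE, COL_WITH = 2, 4, 6, 7

GO_ID = re.compile(r"^GO:\d{7}$")
DB_ID = re.compile(r"^[A-Za-z][\w.-]*:\S+$")  # loose "DB:accession" shape (assumption)

EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def parse_gaf(lines):
    """Yield column lists for annotation rows, skipping '!' comments and blanks."""
    for line in lines:
        if line.strip() and not line.startswith("!"):
            yield line.rstrip("\n").split("\t")


def with_from_errors(rows):
    """Return (symbol, evidence, message) for IC/ISS rows with a bad With/From."""
    errors = []
    for cols in rows:
        evidence = cols[COL_EVIDENCE]
        values = [v for v in re.split(r"[|,]", cols[COL_WITH]) if v]
        if evidence == "IC" and not any(GO_ID.match(v) for v in values):
            errors.append((cols[COL_SYMBOL], evidence, "needs a GO term in With/From"))
        elif evidence == "ISS" and not (values and all(DB_ID.match(v) for v in values)):
            errors.append((cols[COL_SYMBOL], evidence, "needs database identifiers in With/From"))
    return errors


def iss_with_experimental(rows):
    """Return sorted (symbol, GO ID) pairs annotated with both ISS and an experimental code."""
    codes = {}
    for cols in rows:
        codes.setdefault((cols[COL_SYMBOL], cols[COL_GO]), set()).add(cols[COL_EVIDENCE])
    return sorted(key for key, seen in codes.items() if "ISS" in seen and seen & EXPERIMENTAL)
```

Both checks take a list of rows, for example `rows = list(parse_gaf(open(path)))` for some local GAF dump. The ISS pairs returned by `iss_with_experimental` are candidates for review, since the experimental annotation may make the legacy ISS redundant.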
Phenotype2GO Pipeline
- Remove annotations mapping to ncRNAs (e.g. 21U RNAs) and check again for pseudogene exclusion
- Update syntax of WB Phenotypes in With/From column for Phenotype2GO-based IMP annotations
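The ncRNA and pseudogene exclusion could be expressed as a filter applied before the Phenotype2GO annotations are dumped. The sketch below is illustrative only: the biotype labels in `EXCLUDED_BIOTYPES`, the `gene_id` key, and the `biotype_by_gene` lookup are assumptions, not names taken from the WormBase pipeline.

```python
# Biotype labels are assumed; substitute whatever the gene database actually uses.
EXCLUDED_BIOTYPES = {"ncRNA", "21U-RNA", "pseudogene"}


def filter_phenotype2go(annotations, biotype_by_gene):
    """Split annotations into (kept, dropped) according to the gene's biotype.

    Genes missing from biotype_by_gene are kept, so an incomplete lookup
    never silently removes annotations.
    """
    kept, dropped = [], []
    for ann in annotations:
        biotype = biotype_by_gene.get(ann["gene_id"])
        if biotype in EXCLUDED_BIOTYPES:
            dropped.append(ann)
        else:
            kept.append(ann)
    return kept, dropped
```

Returning the dropped annotations alongside the kept ones makes it easy to review what the exclusion removed on each run.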
IEA Pipelines
- UniProtKB will perform InterPro2GO mappings in-house
- TMHMM-derived annotations need a resolvable accession in With/From column
- Is there an accession for TMHMM?
- If not, what could be used in place of this pipeline if we stand to lose annotations?
- Remove this mapping pipeline from WB. Keep TMHMM results in another database tag? Motif?
Annotations to ncRNAs
- Need a specific mapping file for those genes
- Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
- CVS update -d
- Annotated genes in file:
- mir-34
Annotations to Uncloned Genes
- Need a specific mapping file for those genes
- Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
- Genes affected (partial list):
- abc-1
- adp-1
- cad-1
- cat-6
- cib-1
- cup-1
- cup-3
- cup-8
- cup-9
- cup-11
- cyk-2
- exc-1
- exc-2
- exc-3
- exc-6
- exc-8
- hid-2
- hid-4
- ric-1
- seu-2
- seu-3
- sog-1
- sog-2
- sog-3
- sog-4
- sog-5
- sog-6
- sog-10
- szy-1
- szy-2
- szy-3
- szy-5
- szy-6
- szy-7
- szy-8
- szy-9
- szy-10
- szy-11
- szy-12
- szy-13
- szy-14
- szy-15
- szy-16
- szy-17
- szy-18
- szy-19
- unc-65
Annotations to Dead Genes
- Do the dumping scripts filter out annotations made to now dead/invalid genes?
- Corrections/Updates made:
- WBGene00004722 annotations merged into WBGene00011195/sao-1
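If the dumping scripts do not already do this, a filter along the following lines could move annotations from merged genes to their live target and set aside annotations to genes that are dead with no successor. It is a sketch under assumptions: `merged_into` and `live_genes` are hypothetical inputs that would have to be built from the gene database, and annotations are represented as dictionaries with a `gene_id` key.

```python
def resolve_gene(gene_id, merged_into, live_genes):
    """Follow merge records until a live gene is reached.

    Returns the live gene ID, or None if the gene is dead with no live
    merge target (including the case of a circular merge history).
    """
    seen = set()
    while gene_id not in live_genes:
        if gene_id in seen or gene_id not in merged_into:
            return None
        seen.add(gene_id)
        gene_id = merged_into[gene_id]
    return gene_id


def remap_annotations(annotations, merged_into, live_genes):
    """Return (kept, dropped); kept annotations point at live gene IDs."""
    kept, dropped = [], []
    for ann in annotations:
        target = resolve_gene(ann["gene_id"], merged_into, live_genes)
        if target is None:
            dropped.append(ann)
        else:
            kept.append({**ann, "gene_id": target})
    return kept, dropped
```

With `merged_into = {"WBGene00004722": "WBGene00011195"}` and WBGene00011195 in `live_genes`, an annotation to WBGene00004722 is carried over to WBGene00011195, matching the manual correction listed above.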
gp2protein File
- Need a version updated as often as possible to keep IDs as closely in sync as possible.
- UniProtKB can upload file nightly.
- Action - need to develop a pipeline for more frequent updates of WB gp2protein file, as well as gp2ncRNA and gp2unlocalized
- Updates for now:
- mig-21
- daf-16 - getting error message but IDs seem okay in gp2protein file
- lpr-7
- pat-12 - many new isoforms
- mog-2
- let-765
- fmi-1
- dex-1 = D1044.2
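Until a more frequent update pipeline exists, comparing two versions of the gp2protein file would show which genes need attention, such as new isoforms for pat-12. The sketch below assumes each line holds a gene identifier, a tab, and one or more protein accessions separated by semicolons, with `!` marking comment lines. That layout and both function names are assumptions to check against the real file.

```python
def parse_gp2protein(lines):
    """Map each gene ID to the set of protein accessions listed for it."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue
        gene, _, proteins = line.partition("\t")
        mapping.setdefault(gene, set()).update(p for p in proteins.split(";") if p)
    return mapping


def diff_gp2protein(old, new):
    """Report genes added, removed, or mapped to a different accession set."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(g for g in old.keys() & new.keys() if old[g] != new[g]),
    }
```

Running the diff between the last submitted file and a fresh dump gives a short list to check before each upload, rather than waiting for error messages to come back.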
Unsupported/Missing Reference
- Published papers without PMIDs - including DOIs instead would be fine, if available.
- WBPaper00004663 - added doi in paper editor, will need to dump doi in WB GAF
- WBPaper00003666 - rab-11.1 Missing PMID. Action - added PMID to paper in WB paper editor.
- WBPaper00005125 - sdc-2 No PMID, but doi. Annotation to secondary ID? R.
- WBPaper00000823 - sod-1 No PMID, but doi. Dump doi or add annotations to protein2go.
- Some GO references still refer to meeting abstracts
- WBPaper00011144 ced-11 meeting abstract. Action - deleted annotation.
- WBPaper00022068 ced-11 meeting abstract. Action - deleted annotation. Added an IGI annotation with ced-3 from WBPaper00003815.
- WBPaper00011088 ces-1 meeting abstract. Action - deleted annotation. Added an IC annotation with GO:0043565 in WITH/FROM.
- WBPaper00016619 dpr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
- WBPaper00018550 dpr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
- Note - deleted all associated GO annotations for dpr-1, as evidence was based on unpublished results cited in Discussion of a paper.
- WBPaper00011270 flp-3 meeting abstract. Action - deleted annotation and updated with annotation from published paper.
- WBPaper00019004 gcy-36 meeting abstract. Action - deleted annotation and updated with annotation from published paper.
- WBPaper00015370 ggr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
- WBPaper00015370 ggr-2 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
- WBPaper00010338 glc-4 meeting abstract. Action - deleted annotation. Updates possible with WBPaper00034763 and WBPaper00041267.
- WBPaper00011328 gtl-1 meeting abstract. Action - deleted annotation. Added IGI and IMP annotations from WBPaper00031549.
- WBPaper00019247 hda-4 meeting abstract. Action - deleted annotation. Updated annotations to WBPaper00028910.
- WBPaper00018626 him-1 meeting abstract. Action - deleted annotation. Annotations from papers already there.
- WBPaper00012310 iff-1 meeting abstract. Action - deleted annotation. Updates possible from WBPaper00006466?
- WBPaper00018980 jkk-1 meeting abstract. Action - deleted annotation. Possible literature updates?
- WBPaper00017664 kqt-2 meeting abstract. Action - deleted annotation. Updated one to WBPaper00031113 and other to WBPaper00025059.
- WBPaper00015206 lam-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
- WBPaper00010448 lat-1 meeting abstract. Action - deleted annotation. Updated from WBPaper00035406. Needs more work.
- WBPaper00011877 lec-11 meeting abstract. Action - deleted annotation. Could possibly be updated from literature?
- WBPaper00018850 lin-25 meeting abstract. Action - deleted annotation. Updated annotation already there.
- WBPaper00019638 klf-3 meeting abstract. Action - deleted annotation. Definitely needs updates from more recent papers.
- WBPaper00022992 klf-3 meeting abstract. Action - deleted annotation. Definitely needs updates from more recent papers.
- WBPaper00018062 pkc-1 meeting abstract. Action - retained annotations, but updated to published reference.
- WBPaper00019015 pkc-1 meeting abstract. Action - retained annotation, but updated to published reference.
- WBPaper00018615 rab-5 meeting abstract. Action - retained annotation, but updated to published reference.
- WBPaper00015788 rab-5 meeting abstract. Action - deleted annotation. Annotations from papers already there.
- WBPaper00018148 ram-2 meeting abstract. Action - updated P annotations to published reference; deleted F annotation - no reference.
- WBPaper00019612 ram-2 meeting abstract. Action - deleted annotations. No published information to support meeting abstract.
- WBPaper00011383 rap-1 meeting abstract. Action - retained annotation, but updated to published reference and added WITH for ISS.
- WBPaper00019034 rha-1 meeting abstract. Action - deleted C and one P annotations. No published information to support meeting abstract for these. Updated one P annotation to published reference.
- WBPaper00018735 rha-1 meeting abstract. Action - retained annotation, but updated to published reference.
- WBPaper00018011 syd-1 meeting abstract. Action - deleted annotation. Could possibly be updated from literature?
- WBPaper00012588 unc-104 meeting abstract. Action - deleted annotation. Other high quality function annotations exist.
- WBPaper00018934 ceh-37 meeting abstract. R. Action - retained annotation, updated to publication.
- WBPaper00011712 cpr-1 meeting abstract. R. Action - deleted annotation, lack of published evidence.
- WBPaper00022817 cpr-1 meeting abstract. R. Action - deleted annotation, lack of published evidence.
- WBPaper00011485 crh-1 meeting abstract. R. Action - deleted annotation, lack of published evidence.
- WBPaper00017392 crh-1 meeting abstract. R. Action - deleted annotation, lack of published evidence. Annotations from publications can be made; need to annotate this gene.
- WBPaper00018784 fat-7 meeting abstract. R. Action - deleted annotations, high-level terms; high-quality annotations can be made from publications.
- WBPaper00019580 fkh-6 meeting abstract. R. Action - deleted 1 P term, added annotation from published reference.
- WBPaper00022748 flr-4 meeting abstract. R. Action - deleted high level annotations, lack of published evidence. Possible ISS, will wait for SOP for ISS to be determined.
- WBPaper00011648 gbh-2 meeting abstract. R. Action - 4 annotations (2 C, 1 P, 1 F) were from these 2 meeting abstracts; deleted all due to lack of evidence in publications. Gene has similar InterPro2GO annotations.
- WBPaper00018044 gei-4 meeting abstract. R.
- WBPaper00018124 gei-4 meeting abstract. R.
- WBPaper00018893 him-17 meeting abstract. R.
- WBPaper00022917 ins-14 meeting abstract. R.
- WBPaper00018355 inx-12 meeting abstract. R. Also check to replace missing term from now secondary GO ids?
- WBPaper00019572 itx-1 meeting abstract. R.
- WBPaper00018039 klc-2 meeting abstract. R.
- WBPaper00010853 mir-35, mir-36, mir-37 meeting abstract. R.
- WBPaper00019534 mir-84 meeting abstract. R.
- WBPaper00018909 ptp-3 meeting abstract. R.
- WBPaper00018756 scc-3 meeting abstract. R.
- WBPaper00016406 sup-17 meeting abstract. R.
- WBPaper00017138 unc-58 meeting abstract. R.
- WBPaper00018742 wve-1 meeting abstract. R.
- WBPaper00011477 zag-1 meeting abstract. R.
- WBPaper00019643 zig-6 meeting abstract. R.
- Annotations from P2GO pipeline or manual pipeline that reference a paper's erratum, not the original paper
- WBPaper00006304 updated to WBPaper00005637.
- WBPaper00004695 updated to WBPaper00004608. unc-78
- Annotations from P2GO pipeline that reference a duplicate paper object for which there is no bibliographic information in WormBase
- WBPaper00005149 should be merged into WBPaper00005123
- Annotation to incorrect paper - typo in paper ID.
- WBPaper00005379 updated to be WBPaper00005370. sek-1
- WBPaper00030350 updated to be WBPaper00031350. tip-1
- WBPaper00006538 updated to be WBPaper00006358. unc-27
- WBPaper00030369 updated to be WBPaper00031369. unc-41
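The reference rules worked through in the list above (use a PMID, fall back to a doi, do not cite meeting abstracts) could be applied at dump time roughly as follows. This is a sketch with assumed inputs: the `type`, `pmid`, and `doi` keys are hypothetical names for fields in the paper record, and the `DOI:` prefix is an assumption about how the doi would be written in the GAF reference column.

```python
def reference_for(paper):
    """Return (reference, problem) for one paper record.

    Exactly one of the two values is None: either a usable reference
    string, or the reason the paper cannot be cited.
    """
    if paper.get("type") == "meeting_abstract":
        return None, "meeting abstract"
    if paper.get("pmid"):
        return f"PMID:{paper['pmid']}", None
    if paper.get("doi"):
        return f"DOI:{paper['doi']}", None
    return None, "no PMID or doi"
```

Annotations whose paper comes back with a problem can then be listed for curation instead of being dumped with an unsupported reference.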
Note about typos - these papers are getting flagged because they don't have a corresponding PMID - i.e. they aren't indexed by MEDLINE or they are meeting abstracts. If there were paper typos that didn't lead to flagging, how would we find them? One idea: search associated references with gene names via Textpresso and flag all those papers where the annotated gene names are not found in the paper.
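The idea in the note could be prototyped as follows. This sketch does not call Textpresso; it assumes the text of each paper has already been retrieved into a dictionary, and `paper_text`, `names_by_gene`, and both function names are hypothetical. A gene counts as mentioned only when one of its names appears as a whole token, so sek-1 does not match inside sek-10 and sir-2 does not match inside sir-2.1.

```python
import re


def gene_mentioned(text, names):
    """True if any name occurs in text as a whole token, ignoring case."""
    for name in names:
        # Reject matches embedded in a longer name such as "sek-10" or "sir-2.1".
        pattern = r"(?<![\w.-])" + re.escape(name) + r"(?![\w-]|\.\d)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False


def suspicious_references(annotations, paper_text, names_by_gene):
    """Return (gene, paper) pairs where the paper text never names the gene.

    annotations is an iterable of (gene, paper) pairs. Papers with no text
    available are skipped rather than flagged.
    """
    flagged = []
    for gene, paper in annotations:
        text = paper_text.get(paper)
        if text is None:
            continue
        if not gene_mentioned(text, names_by_gene.get(gene, [gene])):
            flagged.append((gene, paper))
    return flagged
```

Matching is case-insensitive so that protein-style mentions such as SEK-1 count. Supplying synonyms and sequence names through `names_by_gene` would reduce false flags for papers that predate the current gene name.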