UniProt-GOA syntax checking
From WormBaseWiki
Revision as of 14:51, 10 August 2012 by Vanaukenk (talk | contribs) (→Unsupported/Missing Reference)
Errors to Fix
With/From Column
- IC annotations need GO term in With/From - DONE
- ISS annotations need database identifiers in With/From
- 917 ISS annotations in postgres as of 2012-07-23; ~400 are legacy annotations without a With/From entry
- Action - update as many as possible, but may need to move forward regardless; note that annotations still lacking a With/From entry would not be dumped/displayed.
- IMP annotations - some annotations use transgenes in the With/From column. This syntax should be okay; see jnk-1 and sir-2.1 for examples. Action - discuss with Rachael, Tony
- Review annotations to make sure they're still consistent with GO annotation practice
- Update transgene symbols to WB transgene identifiers
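The IC and ISS rules above could be checked with something like the following minimal sketch. It assumes a tab-delimited GAF 2.0 file in which column 7 is the evidence code and column 8 is With/From; the function name and the error wording are hypothetical and not part of any WormBase or UniProt-GOA script.

```python
# Evidence codes whose With/From column (GAF column 8) must be filled in.
# The value is the identifier prefix each entry must carry; None means any
# DB:accession identifier is accepted. (Assumed rules, taken from the notes.)
REQUIRED_WITH = {
    "IC": ("GO:",),   # IC needs a GO term in With/From
    "ISS": None,      # ISS needs a database identifier in With/From
}


def check_with_from(gaf_line):
    """Return an error string for one GAF line, or None if it passes."""
    if gaf_line.startswith("!"):          # header/comment line
        return None
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) < 15:                    # GAF has at least 15 mandatory columns
        return "too few columns"
    evidence, with_from = cols[6], cols[7]
    if evidence not in REQUIRED_WITH:
        return None
    if not with_from:
        return f"{evidence} annotation has empty With/From"
    prefixes = REQUIRED_WITH[evidence]
    # With/From entries may be separated by '|' or ','.
    for entry in with_from.replace(",", "|").split("|"):
        if ":" not in entry:
            return f"{evidence} With/From entry '{entry}' is not DB:accession"
        if prefixes and not entry.startswith(prefixes):
            return f"{evidence} With/From entry '{entry}' is not a GO term"
    return None
```

Running this over each line of a GAF dump and collecting the non-None results would give a worklist comparable to the ~400 legacy ISS annotations mentioned above.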
Phenotype2GO Pipeline
- Remove annotations mapping to ncRNAs (e.g. 21U RNAs) and check again for pseudogene exclusion
- Update syntax of WB Phenotypes in With/From column for Phenotype2GO-based IMP annotations
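The ncRNA/pseudogene exclusion could be sketched as below. This is only an illustration: the biotype labels, the `biotype_by_gene` lookup and the function name are all assumptions, and the real Phenotype2GO pipeline may obtain gene class information differently.

```python
# Biotype labels to exclude; these strings are assumed, not WormBase's own.
EXCLUDED_BIOTYPES = {"ncRNA", "21U-RNA", "pseudogene"}


def filter_phenotype2go(annotations, biotype_by_gene):
    """Split (gene_id, go_id) pairs into (kept, removed) lists.

    biotype_by_gene is a hypothetical dict mapping gene ID -> biotype label.
    Genes missing from the dict are kept, so they can be reviewed by hand.
    """
    kept, removed = [], []
    for gene_id, go_id in annotations:
        if biotype_by_gene.get(gene_id) in EXCLUDED_BIOTYPES:
            removed.append((gene_id, go_id))
        else:
            kept.append((gene_id, go_id))
    return kept, removed
```

Returning the removed pairs as well as the kept ones makes it easy to check again that pseudogenes really are being excluded.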
IEA Pipelines
- UniProtKB will perform InterPro2GO mappings in-house
- TMHMM-derived annotations need a resolvable accession in With/From column
- Is there an accession for TMHMM?
- If not, what could be used in place of this pipeline if we stand to lose annotations?
- Remove this mapping pipeline from WB. Keep TMHMM results in another database tag? Motif?
Annotations to ncRNAs
- Need a specific mapping file for those genes
- Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
- cvs update -d
Annotations to Uncloned Genes
- Need a specific mapping file for those genes
- Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
- Genes affected (partial list):
- cad-1
- exc-1
- exc-2
- exc-3
- exc-6
- exc-8
- hid-2
- hid-4
- ric-1
- seu-2
- seu-3
- sog-1
- sog-2
- sog-3
- sog-4
- sog-5
- sog-6
- sog-10
- szy-1
- szy-2
- szy-3
- szy-4
- szy-5
- szy-6
- szy-7
- szy-8
- szy-9
- szy-10
- szy-11
- szy-12
- szy-13
- szy-14
- szy-15
- szy-16
- szy-17
- szy-18
- szy-19
- unc-65
gp2protein File
- Need a version that is updated as often as possible so that IDs stay closely in sync.
- UniProtKB can upload file nightly.
- Action - need to develop a pipeline for more frequent updates of WB gp2protein file, as well as gp2ncRNA and gp2unlocalized
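One way to monitor how far the files drift apart is to list GAF objects that have no gp2protein entry. The sketch below assumes the gp2protein file is tab-delimited with a `DB:ID` in the first column, one or more `;`-separated protein identifiers in the second, and `!` comment lines; both function names are hypothetical.

```python
def read_gp2protein(lines):
    """Parse gp2protein lines into {object_id: [protein_id, ...]}."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):   # skip blanks and comments
            continue
        object_id, _, proteins = line.partition("\t")
        mapping[object_id] = [p for p in proteins.split(";") if p]
    return mapping


def unmapped_gaf_objects(gaf_lines, mapping):
    """Return sorted DB:ID values used in the GAF but absent from mapping."""
    missing = set()
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.split("\t")
        if len(cols) < 2:
            continue
        object_id = f"{cols[0]}:{cols[1]}"     # GAF columns 1 and 2
        if object_id not in mapping:
            missing.add(object_id)
    return sorted(missing)
```

The same comparison could be run against gp2ncRNA and gp2unlocalized, provided those files follow the same two-column layout.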
Unsupported/Missing Reference
- Published papers without PMIDs - including DOIs instead would be fine, if available.
- WBPaper00004663 - added DOI in paper editor; will need to dump DOI in WB GAF
- Some GO references still refer to meeting abstracts
- WBPaper00011144 ced-11 meeting abstract. Action - deleted annotation.
- WBPaper00022068 ced-11 meeting abstract. Action - deleted annotation. Added an IGI annotation with ced-3 from WBPaper00003815.
- WBPaper00011088 ces-1 meeting abstract. Action - deleted annotation. Added an IC annotation with GO:0043565 in With/From.
- WBPaper00016619 dpr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
- WBPaper00018550 dpr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
- Note - deleted all associated GO annotations for dpr-1, as evidence was based on unpublished results cited in Discussion of a paper.
- WBPaper00011270 flp-3 meeting abstract. Action - deleted annotation and updated with annotation from published paper.
- WBPaper00019004 gcy-36 meeting abstract. Action - deleted annotation and updated with annotation from published paper.
- WBPaper00015370 ggr-1 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
- WBPaper00015370 ggr-2 meeting abstract. Action - deleted annotation. No published information to support meeting abstract.
- WBPaper00010338 glc-4 meeting abstract. Action - deleted annotation. Updates possible with WBPaper00034763 and WBPaper00041267.
- WBPaper00011328 gtl-1 meeting abstract. Action - deleted annotation. Added IGI and IMP annotations from WBPaper00031549.
- WBPaper00018934 ceh-37 meeting abstract. R.
- WBPaper00011712 cpr-1 meeting abstract. R.
- WBPaper00022817 cpr-1 meeting abstract. R.
- WBPaper00011485 crh-1 meeting abstract. R.
- WBPaper00017392 crh-1 meeting abstract. R.
- WBPaper00018784 fat-7 meeting abstract. R.
- WBPaper00019580 fkh-6 meeting abstract. R.
- WBPaper00022748 flr-4 meeting abstract. R.
- WBPaper00011648 gbh-2 meeting abstract. R.
- WBPaper00018044 gei-4 meeting abstract. R.
- WBPaper00018124 gei-4 meeting abstract. R.
- Annotations from P2GO pipeline that reference a paper's erratum, not the original paper
- WBPaper00006304 should be WBPaper00005637
- Annotations from P2GO pipeline that reference a duplicate paper object for which there is no bibliographic information in WormBase
- WBPaper00005149 should be merged into WBPaper00005123
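The two paper-ID corrections above, plus the PMID/DOI requirement, could be applied to the GAF reference column (column 6) roughly as follows. This is a sketch under assumptions: the reference field is taken to be pipe-separated, external references are assumed to use the `PMID:` and `DOI:` prefixes, and the function name is hypothetical.

```python
import re

# Paper ID corrections taken from the list above.
PAPER_REMAP = {
    "WBPaper00006304": "WBPaper00005637",  # erratum -> original paper
    "WBPaper00005149": "WBPaper00005123",  # duplicate -> merge target
}


def fix_reference(ref_field):
    """Remap WBPaper IDs in a pipe-separated reference field.

    Returns (fixed_field, has_external), where has_external is True when
    at least one entry is a PMID or DOI.
    """
    fixed = re.sub(
        r"WBPaper\d{8}",
        lambda m: PAPER_REMAP.get(m.group(0), m.group(0)),
        ref_field,
    )
    has_external = any(
        ref.startswith(("PMID:", "DOI:")) for ref in fixed.split("|")
    )
    return fixed, has_external
```

Rows where `has_external` is False are the ones that would still be reported as unsupported or missing references, for example annotations that cite only a meeting abstract.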