Difference between revisions of "UniProt-GOA syntax checking"
From WormBaseWiki
Revision as of 13:40, 9 August 2012
Errors to Fix
With/From Column
- IC annotations need GO term in With/From - DONE
- ISS annotations need database identifiers in With/From
- 917 ISS annotations in postgres as of 2012-07-23
- Some annotations use transgenes in the WITH column - this syntax should be okay, see jnk-1 and sir-2.1 for examples
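The two With/From rules above could be checked with something like the following minimal sketch. It assumes GAF 2.0 tab-delimited lines, where the evidence code is column 7 and With/From is column 8. The function name, the regular expressions, and the message wording are illustrative only and are not taken from the UniProt-GOA checker.

```python
import re

GO_ID = re.compile(r"^GO:\d{7}$")
DB_ID = re.compile(r"^[^:\s]+:\S+$")  # generic "DB:identifier" shape (assumption)

def check_with_from(gaf_line):
    """Return a list of With/From problems for one GAF 2.0 annotation line."""
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) < 8:
        return ["fewer than 8 columns"]
    evidence = cols[6]   # GAF column 7: evidence code
    with_from = cols[7]  # GAF column 8: With/From
    # With/From entries may be separated by "|" or ","
    entries = [e for e in re.split(r"[|,]", with_from) if e]
    problems = []
    if evidence == "IC":
        if not any(GO_ID.match(e) for e in entries):
            problems.append("IC annotation lacks a GO term in With/From")
    elif evidence == "ISS":
        if not entries or not all(DB_ID.match(e) for e in entries):
            problems.append("ISS annotation lacks database identifiers in With/From")
    return problems
```

Transgene entries in With/From are not flagged by this sketch as long as they follow the DB:identifier shape.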
Phenotype2GO Pipeline
- Remove annotations mapping to ncRNAs (e.g. 21U RNAs) and check again for pseudogene exclusion
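As a rough sketch of the filtering step, assuming each annotation is a dict with a gene_id key and that a gene-to-biotype lookup is available. Both the lookup and the biotype labels used here are hypothetical placeholders, not the actual Phenotype2GO data model.

```python
# Biotype labels are placeholders; the real pipeline may use different terms.
EXCLUDED_BIOTYPES = {"ncRNA", "pseudogene"}

def filter_phenotype2go(annotations, biotype_by_gene):
    """Split annotations into (kept, removed) by the biotype of the annotated gene."""
    kept, removed = [], []
    for ann in annotations:
        biotype = biotype_by_gene.get(ann["gene_id"])
        if biotype in EXCLUDED_BIOTYPES:
            removed.append(ann)
        else:
            kept.append(ann)
    return kept, removed
```

Returning the removed annotations separately makes it easy to review what was dropped when re-checking pseudogene exclusion.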
IEA Pipelines
- UniProtKB will perform InterPro2GO mappings in-house
- TMHMM-derived annotations need a resolvable accession in With/From column
- Is there an accession for TMHMM?
- If not, what could be used in place of this pipeline if we stand to lose annotations?
Annotations to ncRNAs
- Need a specific mapping file for those genes
- Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
Annotations to Uncloned Genes
- Need a specific mapping file for those genes
- Action - contacted Rama to see if appropriate directories can be set up in GO CVS/SVN. Passed on to Mike C.
- Genes affected (partial list):
- cad-1
- exc-1
- exc-2
- exc-3
- exc-6
- exc-8
- hid-2
- hid-4
- ric-1
- seu-2
- seu-3
- sog-1
- sog-2
- sog-3
- sog-4
- sog-5
- sog-6
- sog-10
- szy-1
- szy-2
- szy-3
- szy-4
- szy-5
- szy-6
- szy-7
- szy-8
- szy-9
- szy-10
- szy-11
- szy-12
- szy-13
- szy-14
- szy-15
- szy-16
- szy-17
- szy-18
- szy-19
- unc-65
gp2protein File
- Need a version updated as often as possible to keep IDs as closely in sync as possible.
- UniProtKB can upload file nightly.
- Action - need to develop a pipeline for more frequent updates of WB gp2protein file, as well as gp2ncRNA and gp2unlocalized
Unsupported/Missing Reference
- Published papers without PMIDs - including DOIs instead would be fine, if available.
- WBPaper00004663 - added DOI in paper editor; will need to dump DOIs in the WB GAF
- Some GO references still refer to meeting abstracts
- WBPaper00011144 ced-11 meeting abstract. Action - deleted annotation. DONE.
- WBPaper00022068 ced-11 meeting abstract. Action - deleted annotation. DONE. Added an IGI annotation with ced-3 from WBPaper00003815.
- WBPaper00011088 ces-1 meeting abstract. Action - deleted annotation. DONE. Added an IC annotation with GO:0043565 in WITH/FROM.
- WBPaper00018934 ceh-37 meeting abstract. R.
- Annotations from P2GO pipeline that reference a paper's erratum, not the original paper
- WBPaper00006304 should be WBPaper00005637
- Annotations from P2GO pipeline that reference a duplicate paper object for which there is no bibliographic information in WormBase
- WBPaper00005149 should be merged into WBPaper00005123
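Until the paper objects themselves are corrected, the two replacements listed above could be applied when the reference column is written out. This is a minimal sketch: the pipe-separated reference column and the WB_REF prefix in the usage line are assumptions about the GAF output, and the function name is illustrative.

```python
# Paper ID corrections taken from the list above.
PAPER_REMAP = {
    "WBPaper00006304": "WBPaper00005637",  # erratum -> original paper
    "WBPaper00005149": "WBPaper00005123",  # duplicate -> merge target
}

def remap_reference(ref_column):
    """Rewrite known-bad WB paper IDs in a pipe-separated reference column."""
    out = []
    for ref in ref_column.split("|"):
        prefix, sep, ident = ref.partition(":")
        if sep and ident in PAPER_REMAP:
            # Keep whatever database prefix the reference already had
            ref = prefix + sep + PAPER_REMAP[ident]
        elif ref in PAPER_REMAP:
            ref = PAPER_REMAP[ref]
        out.append(ref)
    return "|".join(out)
```

For example, remap_reference("WB_REF:WBPaper00006304") gives "WB_REF:WBPaper00005637", and references that are not in the table pass through unchanged.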
Back to Gene Ontology