Difference between revisions of "New GO Progress Report Script"
From WormBaseWiki
Revision as of 18:59, 6 March 2014
GO is now requiring quarterly progress reports, with the first one due at the meeting this month (2014-03-16).
We've wanted to provide a more detailed progress report for GO for some time, so this is a good opportunity to do that.
Here's one idea for C. elegans manual annotations:
Input files (available on tazendra in home/acedb/ranjana/GO/Progress_Reports/Test):
- gp2protein.wb
- gp_association.wb
Processing steps:
- Ignore all lines with an IEA evidence code
- Replace UniProtKB identifiers in Column 2 with the corresponding WBGene ID using gp2protein.wb
- Remove (i.e., ignore for further reporting) any resulting lines that exactly duplicate another annotation line
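As a rough illustration of the filtering, ID replacement, and de-duplication steps, here is a minimal Python sketch. It rests on several assumptions that are not confirmed here: that both files are tab-delimited with `!` comment lines, that gp2protein.wb has the gene ID (e.g. `WB:WBGene...`) in its first column and one or more `UniProtKB:` accessions separated by `;` in its second, and that Column 2 of gp_association.wb holds the accession with or without a `UniProtKB:` prefix. The function names `load_gp2protein` and `clean_annotations` are made up for this example.

```python
EVIDENCE_COL = 7    # 1-based column numbers, as used in the steps above
OBJECT_ID_COL = 2


def load_gp2protein(lines):
    """Map UniProtKB accession -> WBGene ID (assumed two-column layout)."""
    mapping = {}
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip comment and blank lines
        gene, proteins = line.rstrip("\n").split("\t")[:2]
        gene_id = gene.split(":", 1)[-1]  # drop an assumed "WB:" prefix
        for protein in proteins.split(";"):
            db, _, accession = protein.partition(":")
            if db == "UniProtKB" and accession:
                mapping[accession] = gene_id
    return mapping


def clean_annotations(lines, mapping):
    """Return (unique converted rows, rows whose ID could not be converted)."""
    seen = set()
    unique, unmapped = [], []
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < EVIDENCE_COL:
            continue  # too short to hold an evidence code; skipped in this sketch
        if cols[EVIDENCE_COL - 1] == "IEA":
            continue  # ignore IEA annotations
        accession = cols[OBJECT_ID_COL - 1].split(":", 1)[-1]
        gene_id = mapping.get(accession)
        if gene_id is None:
            unmapped.append(cols)  # keep for separate reporting
            continue
        cols[OBJECT_ID_COL - 1] = gene_id
        row = tuple(cols)
        if row not in seen:  # drop exact duplicates after conversion
            seen.add(row)
            unique.append(row)
    return unique, unmapped
```

Both functions accept any iterable of lines, so they could be called with open file handles for the two input files. Rows that cannot be converted are returned separately rather than dropped, so they can be listed in the report.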
Then determine:
- Total number of unique annotations
- Total number of unique WBGenes
- For each qualifier value in Column 4, count the number of annotations per evidence code (Column 7) and the number of annotations with an entry in Column 12
- Sort results according to unique entries in Column 10 (i.e., each contributing group)
- Also report on any lines where the UniProtKB identifier cannot be converted to a WBGene
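The counts listed above could be computed along these lines. This is only a sketch under stated assumptions: the input is a list of already filtered, de-duplicated annotation rows (each a sequence of column strings with the WBGene ID in Column 2), and "sort results according to Column 10" is read as grouping the per-qualifier counts by contributing group. The function name `summarize` and the shape of the returned dictionary are invented for illustration.

```python
from collections import Counter, defaultdict

# 1-based column numbers, taken from the list above
OBJECT_ID_COL, QUALIFIER_COL, EVIDENCE_COL = 2, 4, 7
GROUP_COL, COL12 = 10, 12


def summarize(rows):
    """Summarize annotation rows (sequences of column strings)."""
    def col(row, n):
        # Treat a missing trailing column as empty
        return row[n - 1] if len(row) >= n else ""

    by_group = defaultdict(
        lambda: {"by_evidence": Counter(), "with_col12": Counter()}
    )
    for row in rows:
        group = by_group[col(row, GROUP_COL)]
        qualifier = col(row, QUALIFIER_COL)
        # Count annotations per (qualifier, evidence code) pair
        group["by_evidence"][(qualifier, col(row, EVIDENCE_COL))] += 1
        # Count annotations per qualifier that have a Column 12 entry
        if col(row, COL12):
            group["with_col12"][qualifier] += 1
    return {
        "total_annotations": len(set(map(tuple, rows))),
        "total_genes": len({col(r, OBJECT_ID_COL) for r in rows}),
        # Contributing groups in sorted order
        "by_group": {g: by_group[g] for g in sorted(by_group)},
    }
```

An empty qualifier is kept as its own category (the empty string), so unqualified annotations are counted alongside qualified ones. Lines whose UniProtKB identifier could not be converted are assumed to have been set aside earlier and would be reported separately.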
For an idea of the kind of table I'm hoping to produce, here's a link to ZFIN's progress report from December 2013:
http://wiki.geneontology.org/index.php/ZFIN_December_2013#Annotation_Progress