Comparison CGI

From WormBaseWiki

Comparative analysis

https://docs.google.com/spreadsheets/d/1QnRd0TE7zZGC4ThQvLX1d-9sQH2luqrYLb_A26o8WCE/edit#gid=579706084

We analyzed ~200 papers manually, but the comparison needs to be done programmatically because the manual approach is inefficient.

Output

We need to compare the gp_annotation.ace file with the OA expression tables.

For each Reference "WBPaper000nnnnn" entry in gp_annotation.ace,

output a file that has:

1) count and WBpaper list of papers that are in gp_annotation.ace but are NOT among the exp_paper entries whose exp_goid is NOT null.

Example: 
56 WBpapers in gp_annotation.ace that are not in ExprOA:
WBPaper00000076
WBPaper00003456
WBPaper00023456
WBPaper00056789
..

‘these are the papers that have been curated for GO_CC and not in ExprOA_CC’
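
Report 1 amounts to a set difference between the paper IDs referenced in gp_annotation.ace and the OA paper IDs that carry a GO ID. The following is a minimal Python sketch, not the actual script: it assumes the .ace text has already been read into a string and that the OA paper IDs with a non-null exp_goid are already available as a set. The function names, the layout of the sample .ace text, and the sample data are all hypothetical.

```python
import re

# Matches lines such as: Reference "WBPaper00000076"
# Curly double quotes are accepted too, in case the text was pasted through an editor
# that converted them.
REFERENCE_RE = re.compile(r'Reference\s+["“”](WBPaper\d+)["“”]')


def ace_paper_ids(ace_text):
    """Return the set of WBPaper IDs found on Reference lines."""
    return set(REFERENCE_RE.findall(ace_text))


def format_report(header, paper_ids):
    """Render a count line followed by one sorted paper ID per line."""
    ids = sorted(paper_ids)
    return "\n".join([f"{len(ids)} {header}:"] + ids)


# Illustrative data only; the object layout is an assumption and these are
# not real curation results.
sample_ace = '''
Reference "WBPaper00000076"
GO_term "GO:0005634"

Reference "WBPaper00003456"
GO_term "GO:0005737"
'''
oa_papers_with_goid = {"WBPaper00003456", "WBPaper00056789"}

only_in_ace = ace_paper_ids(sample_ace) - oa_papers_with_goid
print(format_report("WBpapers in gp_annotation.ace that are not in ExprOA", only_in_ace))
```

Using sets keeps the comparison independent of how many times a paper is referenced in either source.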


2) count and WBpaper list of papers that have exp_goid NOT null and are NOT in gp_annotation.ace

Example: 
nn WBpapers in ExprOA not in gp_annotation.ace:
WBPaper00000076
WBPaper00003456
WBPaper00023456
WBPaper00056789
..

‘these are the papers that have been curated for ExprOA_CC and not in GO_CC’
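
Report 2 first needs the list of OA papers whose exp_goid is not null. The sketch below shows one way that selection could look, using an in-memory SQLite database as a stand-in for the OA tables. The table layout (a joinkey column plus a value column named after the table), the treatment of empty strings as null, and the sample rows are assumptions made for illustration, not a confirmed schema.

```python
import sqlite3

# In-memory stand-in for the OA database. The two-column layout is an
# assumption; adjust it to the real exp_paper and exp_goid tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exp_paper (joinkey TEXT, exp_paper TEXT);
    CREATE TABLE exp_goid  (joinkey TEXT, exp_goid  TEXT);
    INSERT INTO exp_paper VALUES ('1', 'WBPaper00000076'),
                                 ('2', 'WBPaper00003456'),
                                 ('3', 'WBPaper00023456');
    INSERT INTO exp_goid  VALUES ('1', 'GO:0005634'), ('2', ''), ('3', NULL);
""")

# Papers whose expression object carries a non-null, non-empty GO ID.
rows = conn.execute("""
    SELECT DISTINCT p.exp_paper
    FROM exp_paper AS p
    JOIN exp_goid AS g ON g.joinkey = p.joinkey
    WHERE g.exp_goid IS NOT NULL AND g.exp_goid <> ''
""").fetchall()
oa_papers = {row[0] for row in rows}

# Hypothetical set of paper IDs already extracted from gp_annotation.ace.
ace_papers = {"WBPaper00003456"}

only_in_oa = sorted(oa_papers - ace_papers)
print(f"{len(only_in_oa)} WBpapers in ExprOA not in gp_annotation.ace:")
print("\n".join(only_in_oa))
```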


3) for papers that are in gp_annotation.ace (in Reference "WBPaper000nnnnn") and have matching exp_paper IDs for which exp_goid is NOT null,

compare GO_term "GO:000nnnn" in gp_annotation.ace with exp_goid in ExprOA and output:

a) count and WBpaper list of papers that have matching GO annotations
b) count and WBpaper list of papers that have non-matching GO annotations

Example matching: 
nn WBpapers in ExprOA and in gp_annotation.ace with matching annotations:
WBPaper00000076
WBPaper00003456
WBPaper00023456
WBPaper00056789
…

Example NOT matching: 
nn WBpapers in ExprOA and in gp_annotation.ace with non-matching annotations:
WBPaper00000098: gene1 LegoCC <GO_term "GO:000nnnn">; ExprOACC <exp_goid>
WBPaper00000098: gene2 LegoCC <GO_term "GO:000nnnn">; ExprOACC <null>
WBPaper00000098: gene3 LegoCC <null>; ExprOACC <exp_goid>
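
The per-gene comparison in step 3 can be sketched as follows. This is a rough illustration under stated assumptions: the annotations from both sources have already been loaded into dictionaries of the form paper ID -> gene -> set of GO IDs, and a gene counts as matching only when the two sets are identical (the text above does not define "matching" more precisely). The function names and sample data are hypothetical.

```python
def fmt(go_ids):
    """Render a set of GO IDs, or 'null' when the set is empty."""
    return ", ".join(sorted(go_ids)) if go_ids else "null"


def compare_papers(lego_by_paper, oa_by_paper):
    """Split shared papers into matching IDs and per-paper mismatch lines."""
    matching = []
    mismatches = {}  # paper ID -> list of report lines
    for paper in sorted(set(lego_by_paper) & set(oa_by_paper)):
        lego, oa = lego_by_paper[paper], oa_by_paper[paper]
        lines = []
        for gene in sorted(set(lego) | set(oa)):
            lego_ids = lego.get(gene, set())
            oa_ids = oa.get(gene, set())
            if lego_ids != oa_ids:
                lines.append(
                    f"{paper}: {gene} LegoCC <{fmt(lego_ids)}>; ExprOACC <{fmt(oa_ids)}>"
                )
        if lines:
            mismatches[paper] = lines
        else:
            matching.append(paper)
    return matching, mismatches


# Illustrative data only, not real curation results.
lego_by_paper = {
    "WBPaper00000076": {"gene1": {"GO:0005634"}},
    "WBPaper00000098": {"gene1": {"GO:0005634"}, "gene2": {"GO:0005737"}},
}
oa_by_paper = {
    "WBPaper00000076": {"gene1": {"GO:0005634"}},
    "WBPaper00000098": {"gene1": {"GO:0005737"}, "gene3": {"GO:0005886"}},
}

matching, mismatches = compare_papers(lego_by_paper, oa_by_paper)
print(f"{len(matching)} WBpapers in ExprOA and in gp_annotation.ace with matching annotations:")
print("\n".join(matching))
print(f"{len(mismatches)} WBpapers in ExprOA and in gp_annotation.ace with non-matching annotations:")
for lines in mismatches.values():
    print("\n".join(lines))
```

A paper is listed as non-matching as soon as one of its genes differs, so the count reflects papers while the detail lines reflect genes.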

For the genes that have a Lego GO_CC annotation and no GO term annotation in OA, we should import objects (IDA only) for further evaluation by an Expression curator, who can add transgene/antibody info. These objects should be labeled with Kimberly as curator and flagged as no dump.
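
Selecting the objects to import could look roughly like the sketch below. It assumes the Lego annotations are available as dictionaries with paper, gene, go_id and evidence keys, and that the genes already annotated with a GO term in OA are known as a set of (paper, gene) pairs. These field names, the pairing by paper and gene, and the curator and nodump keys on the output are hypothetical and would need to be mapped onto the real OA fields.

```python
def import_candidates(lego_rows, oa_annotated):
    """Return IDA-only Lego annotations that have no GO term in OA.

    lego_rows: iterable of dicts with keys paper, gene, go_id, evidence.
    oa_annotated: set of (paper, gene) pairs that already have a GO term in OA.
    """
    candidates = []
    for row in lego_rows:
        if row["evidence"] != "IDA":
            continue  # only IDA annotations are imported
        if (row["paper"], row["gene"]) in oa_annotated:
            continue  # OA already has a GO term for this gene
        candidates.append({
            "paper": row["paper"],
            "gene": row["gene"],
            "go_id": row["go_id"],
            "curator": "Kimberly",  # label requested for imported objects
            "nodump": True,         # keep the object out of dumps until reviewed
        })
    return candidates


# Illustrative data only, not real curation results.
lego_rows = [
    {"paper": "WBPaper00000098", "gene": "gene1", "go_id": "GO:0005634", "evidence": "IDA"},
    {"paper": "WBPaper00000098", "gene": "gene2", "go_id": "GO:0005737", "evidence": "IDA"},
    {"paper": "WBPaper00000098", "gene": "gene4", "go_id": "GO:0005886", "evidence": "IEA"},
]
oa_annotated = {("WBPaper00000098", "gene1")}

to_import = import_candidates(lego_rows, oa_annotated)
for obj in to_import:
    print(obj)
```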