Difference between revisions of "Comparison CGI"

From WormBaseWiki
Line 3:
  https://docs.google.com/spreadsheets/d/1QnRd0TE7zZGC4ThQvLX1d-9sQH2luqrYLb_A26o8WCE/edit#gid=579706084

− =Tables=
+ =output=
+ <pre>
+ We need to compare the gp_annotation.ace file and OA expression tables

− ==1) Comparison table==
+ for each
− For each WBPaper, list
+ Reference "WBPaper000nnnnn”
− Gene present both in lego cc and OA curated for GO term
+ in gp_annotation.ace

− WBPaper
+ output a file that has:
− Gene1| legocc GO terms| OA GO term
− Gene2| legocc GO terms| OA GO term

− ==2) Table that has only GO for legoCC==
+ 1) count and WBpaper list of papers that are in gp_annotation.ace and are NOT in exp_paper that have exp_goid NOT null.

− WBPaper
+ Example:
− Gene1| legocc GO terms| OA blank
+ 56 WBpapers in gp_annotation.ace that are not in ExprOA:
− Gene2| legocc GO terms| OA blank
+ WBPaper00000076
+ WBPaper00003456
+ WBPaper00023456
+ WBPaper00056789
+ ..

− ==3) Table that has only OA for legoCC==
+ ‘these are the papers that have been curated for GO_CC and not in ExprOA_CC’
− Gene1| legocc blank| OA GO terms


− for the genes that have lego GO_CC and no GO term annotation in OA we should import (only IDA) objects that should be further evaluated by Expression curator to add transgene antibody info, should be labeled with Kimberly as curator and have a no dump.
+ 2) count and WBpaper list of papers that have exp_goid NOT null and are not in the gp_annotation.ace
+
+ Example:
+ nn WBpapers in ExprOA not in gp_annotation.ace:
+ WBPaper00000076
+ WBPaper00003456
+ WBPaper00023456
+ WBPaper00056789
+ ..
+
+ ‘these are the papers that have been curated for ExprOA_CC and not in GO_CC’
+
+
+ 3) for papers that are in gp_annotation.ace (in Reference "WBPaper000nnnnn”) that have matching  exp_paper IDs for which exp_goid IS NOT null
+
+ *compare ‘GO_term "GO:000nnnn”’ in the gp_annotation.ace and exp_goid in ExprOA and output:
+
+ a) count and WBpaper list of papers that have matching GO annotations
+ b) count and WBpaper list of papers that have NOT matching GO annotations
+
+ Example matching:
+ nn WBpapers in ExprOA and in gp_annotation.ace with matching annotations:
+ WBPaper00000076
+ WBPaper00003456
+ WBPaper00023456
+ WBPaper00056789
+ …
+
+ Example NOT matching:
+ nn WBpapers in ExprOA and in gp_annotation.ace with non-matching annotations:
+ WBPaper00000098: gene1 LegoCC <‘GO_term "GO:000nnnn”’>; ExprOACC <exp_goid>
+ WBPaper00000098: gene2 LegoCC <‘GO_term "GO:000nnnn”’>; ExprOACC <null>
+ WBPaper00000098: gene3 LegoCC <null>; ExprOACC <exp_goid>

− == Stats ==
+ </pre>

− *number of papers curated for OA GO_term and not for legocc
+ for the genes that have lego GO_CC and no GO term annotation in OA we should import (only IDA) objects that should be further evaluated by Expression curator to add transgene antibody info, should be labeled with Kimberly as curator and have a no dump.
− *number of papers curated for legocc and not for OA GO_term
− *number of papers that have matching annotations
− *number of papers that have discrepancies

Revision as of 20:48, 10 February 2017

Comparative analysis

https://docs.google.com/spreadsheets/d/1QnRd0TE7zZGC4ThQvLX1d-9sQH2luqrYLb_A26o8WCE/edit#gid=579706084

output

We need to compare the gp_annotation.ace file with the OA expression tables.

For each
Reference	"WBPaper000nnnnn"
in gp_annotation.ace,

output a file that has:

1) Count and WBPaper list of papers that are in gp_annotation.ace but are NOT among the exp_paper entries whose exp_goid is NOT null.

Example: 
56 WBpapers in gp_annotation.ace that are not in ExprOA:
WBPaper00000076
WBPaper00003456
WBPaper00023456
WBPaper00056789
..

‘these are the papers that have been curated for GO_CC and not in ExprOA_CC’
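
A minimal sketch of how the list in 1) could be produced. It assumes gp_annotation.ace carries one `Reference "WBPaper..."` line per annotation object, and that the set of ExprOA papers with a non-null exp_goid has already been fetched. The `GO_annotation` object headers in the sample, the helper names `papers_in_ace` and `report_missing`, and the sample data are hypothetical.

```python
import re

# Assumption: each annotation object in gp_annotation.ace has a line of the
# form: Reference "WBPaper000nnnnn" (possibly indented or tab-separated).
REFERENCE_RE = re.compile(r'^\s*Reference\s+"(WBPaper\d+)"', re.MULTILINE)


def papers_in_ace(ace_text):
    """Return the set of WBPaper IDs named on Reference lines."""
    return set(REFERENCE_RE.findall(ace_text))


def report_missing(ace_papers, oa_papers, header):
    """Format a count line followed by the sorted IDs found only in ace_papers."""
    missing = sorted(ace_papers - oa_papers)
    return "\n".join([f"{len(missing)} {header}:"] + missing)


# Hypothetical sample input; the object headers are illustrative only.
sample_ace = '''GO_annotation : "1"
GO_term "GO:0005634"
Reference "WBPaper00000076"

GO_annotation : "2"
GO_term "GO:0005737"
Reference "WBPaper00003456"
'''
oa_papers = {"WBPaper00003456"}  # papers whose exp_goid is NOT null (made up)

report = report_missing(
    papers_in_ace(sample_ace),
    oa_papers,
    "WBpapers in gp_annotation.ace that are not in ExprOA",
)
print(report)
```

A set difference keeps the count and the list consistent with each other, since both come from the same `missing` value.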


2) Count and WBPaper list of papers that have exp_goid NOT null and are NOT in gp_annotation.ace.

Example: 
nn WBpapers in ExprOA not in gp_annotation.ace:
WBPaper00000076
WBPaper00003456
WBPaper00023456
WBPaper00056789
..

‘these are the papers that have been curated for ExprOA_CC and not in GO_CC’
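
The reverse check in 2) could be sketched as follows. The row shape (one dict per OA entry with `exp_paper` and `exp_goid` keys, one paper per row) is an assumption about how the OA data would be exported, and the function names and sample values are hypothetical.

```python
def oa_papers_with_goid(rows):
    """Return paper IDs from rows whose exp_goid is neither None nor empty."""
    return {row["exp_paper"] for row in rows if row["exp_goid"]}


def papers_only_in_oa(rows, ace_papers):
    """Sorted papers curated in ExprOA (with a GO ID) that the .ace file lacks."""
    return sorted(oa_papers_with_goid(rows) - set(ace_papers))


# Hypothetical export of the OA tables, one row per expression object.
rows = [
    {"exp_paper": "WBPaper00000076", "exp_goid": "GO:0005634"},
    {"exp_paper": "WBPaper00003456", "exp_goid": None},  # no GO ID: ignored
    {"exp_paper": "WBPaper00023456", "exp_goid": "GO:0005737"},
]
ace_papers = {"WBPaper00023456"}  # papers referenced in gp_annotation.ace

only_in_oa = papers_only_in_oa(rows, ace_papers)
print(f"{len(only_in_oa)} WBpapers in ExprOA not in gp_annotation.ace:")
print("\n".join(only_in_oa))
```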


3) For papers that are in gp_annotation.ace (in Reference "WBPaper000nnnnn") and have a matching exp_paper ID for which exp_goid IS NOT null:

*compare GO_term "GO:000nnnn" in gp_annotation.ace with exp_goid in ExprOA and output:

a) count and WBPaper list of papers that have matching GO annotations
b) count and WBPaper list of papers that do NOT have matching GO annotations

Example matching: 
nn WBpapers in ExprOA and in gp_annotation.ace with matching annotations:
WBPaper00000076
WBPaper00003456
WBPaper00023456
WBPaper00056789
…

Example NOT matching: 
nn WBpapers in ExprOA and in gp_annotation.ace with non-matching annotations:
WBPaper00000098: gene1 LegoCC <GO_term "GO:000nnnn">; ExprOACC <exp_goid>
WBPaper00000098: gene2 LegoCC <GO_term "GO:000nnnn">; ExprOACC <null>
WBPaper00000098: gene3 LegoCC <null>; ExprOACC <exp_goid>
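
The comparison in 3) could be sketched like this. It assumes both sources have already been reduced to dicts mapping (paper, gene) to a set of GO IDs, and it treats a paper as matching only when every gene has exactly the same GO ID set on both sides. That matching rule, the function names, and the sample data are assumptions, not something the spec above fixes.

```python
def fmt(go_ids):
    """Render a set of GO IDs, or 'null' when the set is empty."""
    return ", ".join(sorted(go_ids)) or "null"


def compare(lego, oa):
    """Split papers present in both sources into matching and non-matching.

    lego, oa: dicts mapping (paper, gene) -> set of GO IDs.
    Returns (matching_papers, mismatch_lines).
    """
    shared = {p for p, _ in lego} & {p for p, _ in oa}
    matching, mismatches = [], []
    for paper in sorted(shared):
        genes = sorted({g for p, g in list(lego) + list(oa) if p == paper})
        lines = []
        for gene in genes:
            lego_ids = lego.get((paper, gene), set())
            oa_ids = oa.get((paper, gene), set())
            if lego_ids != oa_ids:
                lines.append(
                    f"{paper}: {gene} LegoCC <{fmt(lego_ids)}>; "
                    f"ExprOACC <{fmt(oa_ids)}>"
                )
        if lines:
            mismatches.extend(lines)
        else:
            matching.append(paper)
    return matching, mismatches


# Hypothetical sample data.
lego = {
    ("WBPaper00000076", "gene1"): {"GO:0005634"},
    ("WBPaper00000098", "gene1"): {"GO:0005634"},
    ("WBPaper00000098", "gene2"): {"GO:0005737"},
}
oa = {
    ("WBPaper00000076", "gene1"): {"GO:0005634"},
    ("WBPaper00000098", "gene1"): {"GO:0005886"},
    ("WBPaper00000098", "gene3"): {"GO:0005739"},
}
matching, mismatches = compare(lego, oa)
print(f"{len(matching)} WBpapers in ExprOA and in gp_annotation.ace "
      "with matching annotations:")
print("\n".join(matching))
print("\n".join(mismatches))
```

Reporting mismatches per gene rather than per paper mirrors the "Example NOT matching" layout, where one paper contributes several lines.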

For genes that have a lego GO_CC annotation but no GO term annotation in OA, we should import objects (IDA only) into OA. These objects should be further evaluated by an Expression curator to add transgene/antibody info, should be labeled with Kimberly as curator, and should be set to no dump.
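
The import selection could be sketched as below. The record fields (`paper`, `gene`, `go_term`, `go_code`) and the output fields `curator` and `nodump` are hypothetical names chosen for illustration; the actual OA column names and the way the evidence code is stored in gp_annotation.ace are not given here.

```python
def import_candidates(lego_annotations, oa_annotated):
    """Select IDA-only lego annotations whose (paper, gene) has no OA GO term.

    lego_annotations: list of dicts with paper, gene, go_term, go_code keys.
    oa_annotated: set of (paper, gene) pairs that already have an exp_goid.
    """
    candidates = []
    for ann in lego_annotations:
        key = (ann["paper"], ann["gene"])
        if ann["go_code"] == "IDA" and key not in oa_annotated:
            # Label for follow-up by an Expression curator, excluded from dumps.
            candidates.append({**ann, "curator": "Kimberly", "nodump": True})
    return candidates


# Hypothetical sample data.
lego_annotations = [
    {"paper": "WBPaper00000098", "gene": "gene2",
     "go_term": "GO:0005737", "go_code": "IDA"},
    {"paper": "WBPaper00000098", "gene": "gene4",
     "go_term": "GO:0005634", "go_code": "IEA"},  # not IDA: skipped
    {"paper": "WBPaper00000076", "gene": "gene1",
     "go_term": "GO:0005634", "go_code": "IDA"},  # already in OA: skipped
]
oa_annotated = {("WBPaper00000076", "gene1")}

candidates = import_candidates(lego_annotations, oa_annotated)
for c in candidates:
    print(c["paper"], c["gene"], c["go_term"], c["curator"], c["nodump"])
```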