Difference between revisions of "Phenotype2GO pipeline SOP"

From WormBaseWiki
Line 1: Line 1:
'''Phenotype2GO pipeline (Combination of Sanger and Caltech):'''
+
=====Phenotype2GO pipeline (Sanger and Caltech)=====
  
The old Sanger script that generates the gene_association file (from Igor's work in January 2009) was changed:
+
*The old Sanger script that generates the gene_association file (from Igor's work in January 2009) was changed. Instead of an exclusion list, an 'include list' of papers (mostly large-scale genome-wide studies) is provided to the script. This list is curator-approved, and its papers were explicitly agreed upon for the propagation of GO terms to genes based on their RNAi phenotypes.
 
 
*Instead of excluding some papers while attaching GO terms to genes based on phenotypes from RNAi experiments (the so-called 'exclude list'), an 'include list' that comprises papers (mostly large scale genome-wide studies) is provided to the script. This list is curator approved and explicitly agreed upon for the propagation of GO terms to genes based on their RNAi phenotypes.  
 
  
 
*A new script is used, to use it invoke the script with the -includelist option, e.g.: Run parse_go_terms_new.pl -o gene_association.wb -rnai -include includelist.txt (this example only parses RNAi experiments, to generate full file, you should also give '-gene -var' options as before).
 
Line 9: Line 7:
 
*If you invoke it with '-acefile <filename>' option, the script will also generate Gene-GO_term connections derived from phenotypes. This is currently done by the phenotype procedure of the inherit_GO_terms.pl script.  
 
  
*This script: inherit_GO_terms.pl does not consult any exclusion/inclusion files
+
*The old script: inherit_GO_terms.pl does not consult any exclusion/inclusion files.
**Follows a different logic from parse_go_terms_new.pl, this results in disparity between the gene_association file and the data that ends up in acedb.
 
**So the phenotype procedure of inherit_GO_terms.pl needs to be disabled and the ace file generated by the new script needs to be used.
 
**We do not want every gene with a phenotype to get a GO_term from the phenotype2GO mapping file, but just the genes from 'genome-wide' papers that have been reviewed.
 
**To alter Sanger's version of parse_go_terms_new.pl, a patch file was provided.  
 
  
 +
*To alter Sanger's version of parse_go_terms_new.pl, a patch file was provided.
  
 
+
*Current status: From Igor's e-mail, March 2009: I don't think the phenotype option of the inherit_go_terms script has been disabled. The script should be run without the '-variation' option, but the gene_association file still has those. Try this:
'''Current status:'''
 
From Igor's e-mail, March 2009: I don't think the phenotype option of the inherit_go_terms script has been disabled.  The script should be run without the '-variation' option, but the gene_association file still has those. Try this:
 
 
grep -i wbpheno gene_association.WS200.wb.ce |grep -v RNAi
 
 
+
This is now resolved.
  
  

Revision as of 04:13, 8 September 2010

Phenotype2GO pipeline (Sanger and Caltech)
  • The old Sanger script that generates the gene_association file (from Igor's work in January 2009) was changed. Instead of an exclusion list, an 'include list' of papers (mostly large-scale genome-wide studies) is provided to the script. This list is curator-approved, and its papers were explicitly agreed upon for the propagation of GO terms to genes based on their RNAi phenotypes.
  • A new script is used. To use it, invoke the script with the -includelist option, e.g.: parse_go_terms_new.pl -o gene_association.wb -rnai -include includelist.txt (this example parses only RNAi experiments; to generate the full file, also give the '-gene -var' options as before).
  • If you invoke it with the '-acefile <filename>' option, the script also generates Gene-GO_term connections derived from phenotypes. At present these connections are generated by the phenotype procedure of the inherit_GO_terms.pl script.
  • The old script, inherit_GO_terms.pl, does not consult any exclusion/inclusion files.
  • To alter Sanger's version of parse_go_terms_new.pl, a patch file was provided.
  • Current status: From Igor's e-mail, March 2009: I don't think the phenotype option of the inherit_go_terms script has been disabled. The script should be run without the '-variation' option, but the gene_association file still has those. Try this:

grep -i wbpheno gene_association.WS200.wb.ce | grep -v RNAi

This is now resolved.
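For repeated checks on later releases, the grep pipeline from Igor's e-mail could be wrapped in a small shell helper. This is only a sketch: the function names are hypothetical, and it assumes (as the original pipeline implies) that phenotype-derived lines contain "wbpheno" in any letter case and that RNAi-derived lines contain the string "RNAi".

```shell
# Sketch only: function names are hypothetical, not part of the WormBase scripts.
# Assumption: phenotype-derived annotation lines contain "wbpheno"
# (case-insensitive) and RNAi-derived lines contain the string "RNAi".

# Print phenotype-derived lines that do not come from RNAi experiments.
# $1: path to a gene_association file
non_rnai_phenotype_lines() {
    grep -i 'wbpheno' "$1" | grep -v 'RNAi'
}

# Print how many such lines exist; 0 means no non-RNAi phenotype
# annotations were found in the file.
count_non_rnai_phenotype_lines() {
    non_rnai_phenotype_lines "$1" | wc -l | tr -d ' '
}
```

For example, `count_non_rnai_phenotype_lines gene_association.WS200.wb.ce` prints the number of lines that the original pipeline would have listed.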