Difference between revisions of "WormBase gene association file"

Revision as of 18:03, 23 October 2012

Generating the WormBase gene association file

In the acedb user account on Tazendra, in /home/acedb/ranjana/GO:

  1. Get the latest gene association file generated by WormBase EBI, e.g. gene_association.WS211.wb.ce from ftp://ftp.sanger.ac.uk/pub/wormbase/releases/WS211/ONTOLOGY/.
  2. Use 'grep IEA gene_association.WSXXX.wb.ce > gene_association.wb.electronic' to separate the IEAs.
  3. Use 'grep WBPhenotype gene_association.WSXXX.wb.ce > gene_association.wb.rnai2go' to get both Erich's earlier RNAi2GO associations and the new associations based on allele phenotypes that went into WormBase WS186.
  4. Copy the right go.go.<date> file from /home/acedb/ranjana/citace_upload/go_curation/go_dumper_files/ to this directory and rename it gene_association.wb.manual.
  5. Cd to the /external_annots directory and download a new GOA elegans file (from 04.02.12) for the external annotations, using 'wget ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/9.C_elegans.goa'.
  6. Run the ./wrapper.pl script. Note that this script generates both go.ace and go.go (the gene association format file) in the go_dumper_files directory.
  7. ./wrapper.pl prints the following message and then outputs the NUMBER of ERRORS by COLUMN:

filtering secondary GO IDs into .no_secondary filenames now

  8. Run strip_errors_and_concatenate.pl; this script also generates the gene and GO term numbers.
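
The two grep steps above (steps 2 and 3) can be sketched as a small shell helper. This is only an illustration: the function name split_associations is hypothetical and not part of the WormBase scripts, and the input file name is passed as an argument because the release number (WSXXX) changes with each release.

```shell
#!/bin/sh
# Sketch of steps 2 and 3: split a gene association file with grep.
# split_associations is a hypothetical helper name, not an existing script.
split_associations() {
    infile=$1   # e.g. gene_association.WS211.wb.ce

    # Step 2: keep the lines that mention IEA (electronic annotations).
    # "|| true" keeps the helper going when grep finds no matching line.
    grep IEA "$infile" > gene_association.wb.electronic || true

    # Step 3: keep the lines that mention WBPhenotype
    # (RNAi2GO and allele-phenotype based associations).
    grep WBPhenotype "$infile" > gene_association.wb.rnai2go || true
}
```

It would be called as 'split_associations gene_association.WS211.wb.ce' from the GO directory. As in the original commands, grep matches the pattern anywhere on the line, not only in the evidence column, so a line that mentions IEA in another field would also be kept.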