|
|
Line 1: |
Line 1: |
− | ====Generating the WormBase gene association file====
| |
− | In the acedb user account on Tazendra at:/home/acedb/ranjana/GO:
| |
− | #Download the latest gene association file generated by WormBase EBI (e.g., gene_association.WS211.wb.ce) from the release's ONTOLOGY directory (e.g., ftp://ftp.sanger.ac.uk/pub/wormbase/releases/WS211/ONTOLOGY/ for WS211).
| |
− | #Use 'grep IEA gene_association.WSXXX.wb.ce > gene_association.wb.electronic' to separate the IEAs.
| |
− | #Use 'grep WBPhenotype gene_association.WSXXX.wb.ce > gene_association.wb.rnai2go' to get both Erich's earlier RNAi2GO associations and the new associations based on allele phenotypes that went into WormBase WS186.
| |
− | #Copy the right go.go.<date> file from /home/acedb/ranjana/citace_upload/go_curation/go_dumper_files/ to this directory and rename it gene_association.wb.manual.
| |
− | #cd to the /external_annots directory and download the latest GOA elegans file for external annotations (use 'wget ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/9.C_elegans.goa').
| |
− | #Run the ./wrapper.pl script. Note that this script generates both go.ace and go.go (the gene association format file) in the go_dumper_files directory.
| |
− | #./wrapper.pl prints the following output: 'filtering secondary GO IDs into .no_secondary filenames now', followed by the NUMBER of ERRORS by COLUMN.
| |
− | #Run strip_errors_and_concatenate.pl; this script also generates the file genecounts, which has gene and GO term numbers.
| |
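The two grep filters in the steps above can be wrapped in one small function. This is only a sketch: `split_gaf` is a hypothetical helper name, not one of the existing scripts, and it assumes the output directory already exists.

```shell
# split_gaf: hypothetical helper that wraps the two grep filters above.
#   $1 = input gene association file (e.g. gene_association.WSXXX.wb.ce)
#   $2 = output directory (assumed to exist)
split_gaf() {
    gaf="$1"
    outdir="$2"
    # Substring match anywhere on the line, exactly like the manual commands.
    # '|| true' keeps an empty result from being reported as a failure.
    grep IEA "$gaf" > "$outdir/gene_association.wb.electronic" || true
    grep WBPhenotype "$gaf" > "$outdir/gene_association.wb.rnai2go" || true
}
```

Usage would look like `split_gaf gene_association.WSXXX.wb.ce .`. Because grep matches the string anywhere on the line, a stricter variant could test only the evidence code column (column 7 in GAF 2.0), for example with `awk -F'\t' '$7 == "IEA"'`; whether that is needed depends on the data.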
| | | |
− | On your local machine:
| |
− | #Work in the tmp directory on Maya:
| |
− | #scp file to Maya
| |
− | #Remove 'NOT' annotations from mtm-9, vha-2, vha-3, hsp-60, hsp-12.3, hsp-12.6. (We no longer take out NOT annotations.)
| |
− | #Remove the header from the middle of the concatenated file in two places (also from the top of the UniProt file; search for 'gaf-version') and place it at the top of the file (correct the minor mistake in the header: the space after the $ on one of the lines).
| |
− | #Move the following header from the middle of the file to the top of the file:
| |
− | !Version: $Revision: $
| |
− |
| |
− | !Organism: Caenorhabditis elegans
| |
− |
| |
− | !date: $Date: $
| |
− |
| |
− | !From: WormBase
| |
− |
| |
− | Add these two lines at the bottom of the header:
| |
− |
| |
− | !DataBase_Project_Name: WormBase WS215/WS216
| |
− |
| |
− | !gaf-version: 2.0
| |
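The manual header cleanup above could be approximated as follows. This is a sketch under assumptions: `normalize_header` is a hypothetical name, and it assumes that every line beginning with '!' in the concatenated file is header material that can be dropped and rewritten, which may not hold if other comment lines need to be kept.

```shell
# normalize_header: hypothetical helper that rewrites the header at the top
# of the file and drops every '!' line from the body.
#   $1 = concatenated input file
#   $2 = output file
#   $3 = release label, e.g. "WS215/WS216"
normalize_header() {
    in="$1"
    out="$2"
    release="$3"
    {
        # Header lines as listed above, in the same order.
        printf '%s\n' '!Version: $Revision: $'
        printf '%s\n' '!Organism: Caenorhabditis elegans'
        printf '%s\n' '!date: $Date: $'
        printf '%s\n' '!From: WormBase'
        printf '!DataBase_Project_Name: WormBase %s\n' "$release"
        printf '%s\n' '!gaf-version: 2.0'
        # Body: everything that is not a '!' comment/header line.
        grep -v '^!' "$in" || true
    } > "$out"
}
```

For example, `normalize_header concatenated_file fixed_file "WS215/WS216"` (both file names are placeholders). It is worth comparing the result against a hand-edited file before relying on it.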
− |
| |
− | To upload to the GO consortium:
| |
− | #Remove the header 'gaf 2.0' from the top of the UniProt file
| |
− | #gzip file
| |
− | #Copy file to the tmp directory
| |
− | #Run 'CVS_RSH=ssh; export CVS_RSH' at the terminal so that CVS uses ssh as its remote shell program
| |
− | #NOTE on upload: stopped using the -z (compression) option during upload; remove 'z3' from 'qz3'
| |
− | #cvs -q -d :ext:ranjana@ext.geneontology.org:/share/go/cvs checkout go/gene-associations/submission/gene_association.wb.gz
| |
− | #Replace old file with the new gene_association.wb.gz under go/gene-associations/submission/
| |
− | #To check in the new file:
| |
− | #cvs -q -d :ext:ranjana@ext.geneontology.org:/share/go/cvs commit go/gene-associations/submission/gene_association.wb.gz
| |
− | #Outputs new version numbers: new revision: 1.103; previous revision: 1.102
| |
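The checkout, replace, and commit sequence above can be collected into one function with a dry-run switch, so the commands can be reviewed before anything is sent to the repository. This is a sketch: `run`, `upload_gaf`, and the `DRY_RUN` variable are hypothetical names introduced here; the CVS root and file path are the ones given in the steps.

```shell
CVSROOT_GO=':ext:ranjana@ext.geneontology.org:/share/go/cvs'
GAF_PATH='go/gene-associations/submission/gene_association.wb.gz'

# run: hypothetical helper; prints the command when DRY_RUN is 1 (the
# default here), otherwise executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# upload_gaf: $1 = path to the new gzipped gene association file.
upload_gaf() {
    new_gz="$1"
    # CVS uses ssh as its remote shell for the :ext: access method.
    CVS_RSH=ssh
    export CVS_RSH
    run cvs -q -d "$CVSROOT_GO" checkout "$GAF_PATH"
    run cp "$new_gz" "$GAF_PATH"
    run cvs -q -d "$CVSROOT_GO" commit "$GAF_PATH"
}
```

Calling `upload_gaf gene_association.wb.gz` with the default setting only prints the three commands; `DRY_RUN=0` would execute them. As in the manual steps, the commit has no `-m` option, so CVS would normally open an editor for the log message.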
− |
| |
− | To update the README:
| |
− | #Checkout README and modify to reflect the version change and any other changes:
| |
− | #cvs -q -d :ext:ranjana@ext.geneontology.org:/share/go/cvs checkout go/gene-associations/readme/WormBase.README
| |
− | #Changes: updated only for release numbers
| |
− | #For manual annotations, column 2 (DB_Object_ID) entries have all been converted into WBGene IDs, i.e., WBGeneXXXXXXXX, as of the Dec. 2009 upload.
| |
− | #Commit new README (this time modified for only WS numbers):
| |
− | #cvs -q -d :ext:ranjana@ext.geneontology.org:/share/go/cvs commit go/gene-associations/readme/WormBase.README
| |
− | #Outputs new version numbers: new revision: 1.54; previous revision: 1.53 (noted only version changes)
| |
− |
| |
− | Specifications for script that is run by wrapper.pl and generates the gene association file:
| |
− |
| |
− | Column 6: Reference:
| |
− | In the OA: the script looks at the 'Accession number' field and dumps that entry; if that field is empty, it looks at the Paper field and dumps that instead.
| |
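The fallback rule for column 6 can be illustrated in a few lines. This is only an illustration of the rule as described, not the actual script run by wrapper.pl; `pick_reference` is a hypothetical name and the argument values in the usage note are placeholders.

```shell
# pick_reference: hypothetical illustration of the column 6 rule.
#   $1 = value of the 'Accession number' field (may be empty)
#   $2 = value of the Paper field
pick_reference() {
    accession="$1"
    paper="$2"
    if [ -n "$accession" ]; then
        # Accession number present: use it as the reference.
        printf '%s\n' "$accession"
    else
        # Accession number empty: fall back to the Paper field.
        printf '%s\n' "$paper"
    fi
}
```

For example, `pick_reference '' 'WBPaperXXXXXXXX'` prints the Paper value because the accession argument is empty.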
− |
| |
− | Back to [[Gene Ontology]]
| |