Specifications for WB gpi file

From WormBaseWiki
Revision as of 18:52, 20 March 2013 by Vanaukenk (talk | contribs)

These specifications are based on the documentation on the GO wiki:

http://wiki.geneontology.org/index.php/Final_GPAD_and_GPI_file_format#Final_format_.2809_Jan_2013.29_2

A new gpi file will need to be created for each WormBase release, using the C. elegans xrefs file for that release, which is available on the FTP site:

ftp://ftp.sanger.ac.uk/pub/wormbase/releases/WS236/species/c_elegans/

The file is named according to the release, e.g., c_elegans.WS236.xrefs.txt.gz
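
As a minimal sketch of the naming convention above, the directory and file name for a given release could be built as follows. The function name `xrefs_location` is hypothetical, and the sketch assumes that other releases follow the same path pattern as the WS236 example.

```python
def xrefs_location(release):
    """Return (directory URL, file name) of the C. elegans xrefs file.

    Assumption: every release uses the same layout as the WS236 example.
    """
    base = "ftp://ftp.sanger.ac.uk/pub/wormbase/releases"
    directory = f"{base}/{release}/species/c_elegans/"
    filename = f"c_elegans.{release}.xrefs.txt.gz"
    return directory, filename


directory, filename = xrefs_location("WS236")
print(directory + filename)
```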


column  name                     required?  cardinality   GAF column  Example for UniProt  Example for WormBase  Column in xrefs file  Value if not in xrefs file
------  -----------------------  ---------  ------------  ----------  -------------------  --------------------  --------------------  ----------------------------------------
01      DB_Object_ID             required   1             2/17        Q4VCS5-1             WBGene00000035        2                     n/a
02      DB_Object_Symbol         required   1             3           AMOT                 ace-1
03      DB_Object_Name           optional   0 or 1        10          Angiomotin                                 n/a                   n/a
04      DB_Object_Synonym(s)     optional   0 or greater  11          KIAA1071|AMOT        ACE1                                        Other_name in ?Gene model
05      DB_Object_Type           required   1             12          protein              gene                  n/a                   gene
06      Taxon                    required   1             13          taxon:9606           taxon:6239            n/a                   taxon:6239
07      Parent_Object_ID         optional   0 or 1        -           UniProtKB:Q4VCS5     WB:WBGene00000035                           Gene ID in ?Gene model prefaced with WB:
08      DB_Xref(s)               optional   0 or greater  -           -                    UniProtKB:P38433
09      Gene_Product_Properties  optional   0 or greater  -           See Note 4 below
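
To illustrate how the nine columns fit together, here is a minimal sketch that assembles one gpi line from the WormBase values in the table above. The function name `gpi_line` is hypothetical. The sketch assumes the columns are tab-delimited and that multiple values within a column are joined with a pipe, as in the UniProt synonym example; it does not read the xrefs file, whose layout is not described here beyond the gene ID being in column 2.

```python
def gpi_line(gene_id, symbol, synonyms=(), xrefs=()):
    """Build one gpi line for a C. elegans gene (illustrative only)."""
    columns = [
        gene_id,             # 01 DB_Object_ID, e.g. WBGene00000035
        symbol,              # 02 DB_Object_Symbol, e.g. ace-1
        "",                  # 03 DB_Object_Name, optional, left empty
        "|".join(synonyms),  # 04 DB_Object_Synonym(s), assumed pipe-separated
        "gene",              # 05 DB_Object_Type, fixed value from the table
        "taxon:6239",        # 06 Taxon, fixed value from the table
        "WB:" + gene_id,     # 07 Parent_Object_ID, gene ID prefaced with WB:
        "|".join(xrefs),     # 08 DB_Xref(s), assumed pipe-separated
        "",                  # 09 Gene_Product_Properties, left empty
    ]
    # Assumption: gpi columns are separated by tabs.
    return "\t".join(columns)


line = gpi_line("WBGene00000035", "ace-1", ["ACE1"], ["UniProtKB:P38433"])
print(line)
```

Optional columns with no value are written as empty strings so that every line still has nine fields.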