Specifications for WB gpi file

From WormBaseWiki
Revision as of 18:59, 20 March 2013 by Vanaukenk (talk | contribs)

These specifications are based on the documentation on the GO wiki:

http://wiki.geneontology.org/index.php/Final_GPAD_and_GPI_file_format#Final_format_.2809_Jan_2013.29_2

A new gpi file must be generated for each WormBase release from the C. elegans xrefs file, which is available on the FTP site:

ftp://ftp.sanger.ac.uk/pub/wormbase/releases/WS236/species/c_elegans/

The file is named according to the release, e.g., c_elegans.WS236.xrefs.txt.gz

Output will be sorted according to ascending WBGene ID.
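
The file naming and sort order described above can be sketched in Python as follows. This is only an illustration, not the production script: the function names are hypothetical, and it assumes the xrefs file is tab-delimited with header/comment lines that start with "//". The WBGene ID is taken from column 2 of the xrefs file, as listed in the table below.

```python
import gzip


def xrefs_filename(release):
    """Build the xrefs file name for a release such as 'WS236'."""
    return "c_elegans.%s.xrefs.txt.gz" % release


def wbgene_sort_key(gene_id):
    """Numeric part of a WBGene ID, e.g. 'WBGene00000035' -> 35."""
    return int(gene_id[len("WBGene"):])


def sorted_gene_ids(lines):
    """Collect the unique WBGene IDs (xrefs column 2) in ascending order."""
    ids = set()
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: blank lines and lines starting with '//' carry no data.
        if not line or line.startswith("//"):
            continue
        fields = line.split("\t")  # assumption: tab-delimited columns
        ids.add(fields[1])
    return sorted(ids, key=wbgene_sort_key)


def read_gene_ids(path):
    """Read a gzipped xrefs file and return its sorted WBGene IDs."""
    with gzip.open(path, "rt") as handle:
        return sorted_gene_ids(handle)
```

Sorting on the numeric part of the ID rather than on the raw string keeps the order correct even if the IDs were ever not zero-padded to the same width.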


Column | Name | Required? | Cardinality | GAF column | Example for UniProt | Example for WormBase | Column in xrefs file | Value if not in xrefs file
01 | DB_Object_ID | required | 1 | 2/17 | Q4VCS5-1 | WBGene00000035 | 2 | n/a
02 | DB_Object_Symbol | required | 1 | 3 | AMOT | ace-1 | 3; if no value in 3, then 1 |
03 | DB_Object_Name | optional | 0 or 1 | 10 | Angiomotin | | n/a | n/a
04 | DB_Object_Synonym(s) | optional | 0 or greater | 11 | KIAA1071|AMOT | ACE1 | | Other_name in ?Gene model
05 | DB_Object_Type | required | 1 | 12 | protein | gene | n/a | gene
06 | Taxon | required | 1 | 13 | taxon:9606 | taxon:6239 | n/a | taxon:6239
07 | Parent_Object_ID | optional | 0 or 1 | - | UniProtKB:Q4VCS5 | WB:WBGene00000035 | | Gene ID in ?Gene model, prefaced with WB:
08 | DB_Xref(s) | optional | 0 or greater | - | - | UniProtKB:P38433 | 8, prefaced with UniProtKB: |
09 | Gene_Product_Properties | optional | 0 or greater | - | | | | See Note 4 below
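
As a rough illustration of the column mapping in the table above, the following Python sketch turns one row of the xrefs file into the nine gpi columns. It rests on several assumptions that are not part of the specification: the function name gpi_row is hypothetical, an empty xrefs column is assumed to be written as "." or left blank, "n/a" is interpreted as an empty gpi column, synonyms (Other_name in the ?Gene model) are passed in by the caller, and column 09 is left empty.

```python
MISSING = "."  # assumption: placeholder used for an empty xrefs column


def gpi_row(xrefs_fields, synonyms=()):
    """Map one xrefs row (list of column values) to the nine gpi columns."""

    def col(n):
        # Return 1-based xrefs column n, or "" if absent or empty.
        value = xrefs_fields[n - 1] if len(xrefs_fields) >= n else ""
        return "" if value in ("", MISSING) else value

    gene_id = col(2)
    symbol = col(3) or col(1)  # column 3; if no value in 3, then column 1
    uniprot = col(8)
    return [
        gene_id,                                      # 01 DB_Object_ID
        symbol,                                       # 02 DB_Object_Symbol
        "",                                           # 03 DB_Object_Name (n/a, assumed empty)
        "|".join(synonyms),                           # 04 DB_Object_Synonym(s)
        "gene",                                       # 05 DB_Object_Type
        "taxon:6239",                                 # 06 Taxon
        "WB:" + gene_id,                              # 07 Parent_Object_ID
        ("UniProtKB:" + uniprot) if uniprot else "",  # 08 DB_Xref(s)
        "",                                           # 09 Gene_Product_Properties (left empty here)
    ]
```

For example, a call such as gpi_row(fields, synonyms=["ACE1"]) returns a list that can be joined with tabs to form one output line.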