Specifications for WB gpi file

From WormBaseWiki

Revision as of 18:57, 20 March 2013

These specifications are based on the documentation on the GO wiki:

http://wiki.geneontology.org/index.php/Final_GPAD_and_GPI_file_format#Final_format_.2809_Jan_2013.29_2

We will need to create a new GPI file for each WormBase release, using the C. elegans xrefs file that is available on the FTP site:

ftp://ftp.sanger.ac.uk/pub/wormbase/releases/WS236/species/c_elegans/

The file is named according to the release, e.g., c_elegans.WS236.xrefs.txt.gz
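As a minimal sketch of the naming convention above, the file name and FTP location can be derived from the release identifier. The function names are hypothetical, and the URL pattern is extrapolated from the WS236 example only; other releases are assumed, not confirmed, to follow the same layout.

```python
import re

# Base path copied from the WS236 example; assumed to be the same for other releases.
FTP_RELEASES = "ftp://ftp.sanger.ac.uk/pub/wormbase/releases"


def xrefs_filename(release: str) -> str:
    """Return the xrefs file name for a release identifier such as 'WS236'."""
    if not re.fullmatch(r"WS\d+", release):
        raise ValueError(f"not a WormBase release identifier: {release!r}")
    return f"c_elegans.{release}.xrefs.txt.gz"


def xrefs_url(release: str) -> str:
    """Return the assumed FTP URL of the C. elegans xrefs file for a release."""
    return f"{FTP_RELEASES}/{release}/species/c_elegans/{xrefs_filename(release)}"


print(xrefs_filename("WS236"))
print(xrefs_url("WS236"))
```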

Output will be sorted according to ascending WBGene ID.
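The sort order could be implemented along these lines. This is a sketch only: the helper name is hypothetical, and the second and third IDs in the example list are made up for illustration. Sorting on the numeric part of the ID avoids relying on all IDs being zero-padded to the same width.

```python
def wbgene_sort_key(gene_id: str) -> int:
    """Return the numeric part of a WBGene ID for use as a sort key."""
    prefix = "WBGene"
    if not gene_id.startswith(prefix):
        raise ValueError(f"not a WBGene ID: {gene_id!r}")
    return int(gene_id[len(prefix):])


# Illustrative IDs; only WBGene00000035 appears in the specification itself.
gene_ids = ["WBGene00000035", "WBGene00099999", "WBGene00000001"]
sorted_ids = sorted(gene_ids, key=wbgene_sort_key)
print(sorted_ids)
```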


{|
! column !! name !! required? !! cardinality !! GAF column !! Example for UniProt !! Example for WormBase !! Column in xrefs file !! Value if not in xrefs file
|-
| 01 || DB_Object_ID || required || 1 || 2/17 || Q4VCS5-1 || WBGene00000035 || 2 || n/a
|-
| 02 || DB_Object_Symbol || required || 1 || 3 || AMOT || ace-1 || 3, if no value in 3, then 1
|-
| 03 || DB_Object_Name || optional || 0 or 1 || 10 || Angiomotin || n/a || n/a
|-
| 04 || DB_Object_Synonym(s) || optional || 0 or greater || 11 || <nowiki>KIAA1071|AMOT</nowiki> || ACE1 || Other_name in ?Gene model
|-
| 05 || DB_Object_Type || required || 1 || 12 || protein || gene || n/a || gene
|-
| 06 || Taxon || required || 1 || 13 || taxon:9606 || taxon:6239 || n/a || taxon:6239
|-
| 07 || Parent_Object_ID || optional || 0 or 1 || - || UniProtKB:Q4VCS5 || WB:WBGene00000035 || Gene ID in ?Gene model prefaced with WB:
|-
| 08 || DB_Xref(s) || optional || 0 or greater || - || - || UniProtKB:P38433
|-
| 09 || Gene_Product_Properties || optional || 0 or greater || - || See Note 4 below
|}
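
The mapping in the table above can be sketched as follows, for the columns whose values come from the xrefs file or are fixed (01, 02, 05, 06). Several things here are assumptions rather than part of the specification: the xrefs file is taken to be tab-delimited, the column numbers in the table are taken to be 1-based, a missing value is taken to be an empty string or a single ".", and the function name and the sample rows (apart from WBGene00000035 and ace-1) are made up for illustration. Columns that depend on the ?Gene model (04, 07) and the remaining optional columns are left empty.

```python
from typing import List

# Assumption: these strings mark "no value" in an xrefs column.
MISSING = {"", "."}


def xrefs_row_to_gpi(fields: List[str]) -> List[str]:
    """Map one split xrefs row to the nine GPI columns."""
    if len(fields) < 3:
        raise ValueError("expected at least three xrefs columns")
    gene_id = fields[1]  # xrefs column 2 -> GPI column 01
    # GPI column 02: xrefs column 3, falling back to column 1 when 3 has no value.
    symbol = fields[2] if fields[2] not in MISSING else fields[0]
    return [
        gene_id,        # 01 DB_Object_ID
        symbol,         # 02 DB_Object_Symbol
        "",             # 03 DB_Object_Name (n/a)
        "",             # 04 DB_Object_Synonym(s), from the ?Gene model, not filled here
        "gene",         # 05 DB_Object_Type (fixed value)
        "taxon:6239",   # 06 Taxon (fixed value)
        "",             # 07 Parent_Object_ID, from the ?Gene model, not filled here
        "",             # 08 DB_Xref(s)
        "",             # 09 Gene_Product_Properties
    ]


# Made-up xrefs rows: the first has a value in column 3, the second does not.
rows = [
    "SEQNAME.1\tWBGene00000035\tace-1",
    "SEQNAME.2\tWBGene00099999\t.",
]
gpi_lines = ["\t".join(xrefs_row_to_gpi(row.split("\t"))) for row in rows]
for line in gpi_lines:
    print(line)
```

The second sample row shows the fallback for column 02: with no value in xrefs column 3, the symbol is taken from column 1.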