Difference between revisions of "Specifications for a DAF for gene-disease data"

From WormBaseWiki

Revision as of 00:24, 28 January 2017

New AGR Disease Working Group specified DAF

  • These specifications have been discussed and specified by the Disease Working Group of the AGR
  • Annotations from the computational pipeline that WB currently has for assigning 'Potential models' will not be included in the MOD-specific DAFs. The plan is to have a separate AGR orthology pipeline to link human genes, orthologs from model organisms and disease.
  • Note that not all models are in place yet in WB, though some of the data are already being curated in the Caltech Ontology Annotator curation tool for disease.


DAF columns and mapping of gene-disease data to columns
{|
!Column!!Content!!Required (R)<br/>Optional (O)!!Cardinality!!Example!!Definition!!Model tag!!New/Exists!!Comment
|-
|1||Taxon||R||1||taxon:6239||NCBI taxonomic identifier for the organism
|-
|2||DB Object Type||R||1||gene, allele, transgene, genotype, fish||The type of object being annotated
|-
|3||DB||R||1||WB||The database from which the identifiers in 'DB Object ID' and 'DB Object Symbol' are drawn
|-
|4||DB Object ID||R||1||WB:WBGene00004887||A unique identifier from the database in DB for the entity being annotated
|-
|5||DB Object Symbol||R||1||smn-1||A (unique and valid) symbol to which DB Object ID is matched
|-
|6||Inferred Gene Association||O||0 or greater||WB:WBGene00004887||Database ID for inferred gene/marker association that can be made based on the DB Object ID
|-
|7||Gene Product Form ID||O||0 or 1||UniProtKB id or PRO ID||This field allows the annotation of specific variants of that gene or gene product
|-
|8||Experimental Conditions<br/>(to create the model)||O||0 or greater||standard conditions<br/>chemical/drug treatments (ChEBI ID, ZECO)<br/>dietary manipulations (specify entity)<br/>surgery/amputation<br/>bacterial/virus exposure (taxon ID)||Experimental/environmental (i.e. non-genetic) conditions required for the model, used particularly for induced models
|-
|9||Association Type||R||1||is_model_of<br/>causes_or_contributes_to_condition<br/>causes_condition<br/>contributes_to_condition<br/>is_marker_for||Relationship between the DB object and the disease
|-
|10||Qualifier||O||0 or 1||NOT||Used to indicate that the DB object is not associated with the DO term/association type
|-
|11||DO ID||R||1||DOID:12858||DO identifier for disease
|-
|12||With||O||0 or greater||DB:gene_symbol<br/>DB:gene_id<br/>DB:gene_symbol[allele_symbol]<br/>DB:allele_id||EITHER specifies additional genetic components of the model (where DB Object Type is not genotype/strain/fish)<br/>OR specifies the orthologous (usually human) gene in annotations with ‘ISS/ISO’ evidence code
|-
|13||Modifier Association Type||O||0 or 1||condition_ameliorated_by<br/>condition_exacerbated_by||Relationship between the modifier and the disease model
|-
|14||Modifier - qualifier||O||0 or 1||NOT||Used to indicate DB object is not associated with the DO term/association type
|-
|15||Modifier (genetic)||O||0 or greater||DB:gene_symbol<br/>DB:gene_symbol(allele_symbol)<br/>DB:gene_id<br/>DB:allele_id||Specifies a genetic object (allele/gene) that modifies the disease model
|-
|16||Modifier - experimental conditions||O||0 or greater||standard conditions<br/>chemical/drug treatments (ChEBI ID, ZECO)<br/>dietary manipulations (specify entity)<br/>surgery/amputation<br/>bacterial/virus exposure (taxon ID)||Specifies a non-genetic experimental condition that modifies the disease model
|-
|17||Evidence Code||R||1 or greater||EXP, IMP, IPM, IGI, IDA, IED, IEP, IAGP, ISS, ISO, TAS, IC, IEA||From GO: Indicates the kind of evidence in the cited source that supports the disease annotation. If a reference describes multiple methods that each provide evidence, then multiple annotations should be made with the same DO term and different evidence codes.
|-
|18||Genetic sex||O||0 or 1||male/female/hermaphrodite||Genetic sex of the model
|-
|19||DB:Reference||R||1||PMID:14978262||Unique identifier(s) for a single source cited as an authority for the attribution of the DO ID to the DB Object ID
|-
|20||Date||R||1||20090118||Date on which the annotation was made; format is YYYYMMDD
|-
|21||Assigned By||R||1||WB||The database which made the annotation - one of the values from the set of GO database cross-references
|}
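
The required/optional flags and cardinalities in the table above can be checked mechanically. The following is a minimal sketch only, not part of the Working Group specification. It assumes a record is held as a Python dict mapping each column name to a list of values; that in-memory representation, the names `AGR_DAF_COLUMNS` and `validate_record`, and the wording of the error messages are all hypothetical. The example record is assembled from the Example column purely to exercise the checks and is not a real annotation.

```python
from datetime import datetime

# (column number, column name, minimum count, maximum count or None for "or greater"),
# transcribed from the Required/Optional and Cardinality columns of the table.
AGR_DAF_COLUMNS = [
    (1, "Taxon", 1, 1),
    (2, "DB Object Type", 1, 1),
    (3, "DB", 1, 1),
    (4, "DB Object ID", 1, 1),
    (5, "DB Object Symbol", 1, 1),
    (6, "Inferred Gene Association", 0, None),
    (7, "Gene Product Form ID", 0, 1),
    (8, "Experimental Conditions", 0, None),
    (9, "Association Type", 1, 1),
    (10, "Qualifier", 0, 1),
    (11, "DO ID", 1, 1),
    (12, "With", 0, None),
    (13, "Modifier Association Type", 0, 1),
    (14, "Modifier - qualifier", 0, 1),
    (15, "Modifier (genetic)", 0, None),
    (16, "Modifier - experimental conditions", 0, None),
    (17, "Evidence Code", 1, None),
    (18, "Genetic sex", 0, 1),
    (19, "DB:Reference", 1, 1),
    (20, "Date", 1, 1),
    (21, "Assigned By", 1, 1),
]

# Allowed values listed in the Example cell of column 9.
ASSOCIATION_TYPES = {
    "is_model_of",
    "causes_or_contributes_to_condition",
    "causes_condition",
    "contributes_to_condition",
    "is_marker_for",
}


def validate_record(record):
    """Return a list of problems found in one record (empty list if none)."""
    errors = []
    for number, name, minimum, maximum in AGR_DAF_COLUMNS:
        count = len(record.get(name, []))
        if count < minimum:
            errors.append(f"column {number} ({name}): required value missing")
        elif maximum is not None and count > maximum:
            errors.append(f"column {number} ({name}): at most {maximum} value(s) allowed")
    for value in record.get("Association Type", []):
        if value not in ASSOCIATION_TYPES:
            errors.append(f"column 9 (Association Type): unknown value {value!r}")
    for value in record.get("Date", []):
        # Require exactly eight digits, then let strptime reject impossible dates.
        ok = len(value) == 8 and value.isdigit()
        if ok:
            try:
                datetime.strptime(value, "%Y%m%d")
            except ValueError:
                ok = False
        if not ok:
            errors.append(f"column 20 (Date): {value!r} is not a valid YYYYMMDD date")
    return errors


# Illustrative record built from the Example column; not a real annotation.
example = {
    "Taxon": ["taxon:6239"],
    "DB Object Type": ["gene"],
    "DB": ["WB"],
    "DB Object ID": ["WB:WBGene00004887"],
    "DB Object Symbol": ["smn-1"],
    "Association Type": ["is_model_of"],
    "DO ID": ["DOID:12858"],
    "Evidence Code": ["IMP"],
    "DB:Reference": ["PMID:14978262"],
    "Date": ["20090118"],
    "Assigned By": ["WB"],
}
print(validate_record(example))  # prints []
```

Keeping the cardinalities in a single list means that a later revision of the table needs a change in only one place. Identifier syntax (for example the `DB:gene_symbol(allele_symbol)` form) is deliberately not checked here.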

Current WormBase DAF (pre-AGR)

DAF 2.0 for gene-disease data includes all genes with the Experimental_model and/or Potential_model tags.

Format: The gene-disease association file is a 17-column tab-delimited file in which 11 columns are required and 6 are optional.


Mapping of GAF columns to gene-disease data
{|
!Column!!Column Name!!Required?!!Cardinality!!Example
|-
|1||DB||required||1||WB
|-
|2||DB Object ID||required||1||WBGene00007799
|-
|3||DB Object Symbol||required||1||nrx-1
|-
|4||Qualifier||optional||0 or greater||(no value, leave empty)
|-
|5||GO ID||required||1||DOID:0060041
|-
|6||DB:Reference (<nowiki>|</nowiki>DB:Reference)||required||1 or greater<br/>(separate values with pipes)||WBPaper00041363<nowiki>|</nowiki>WBPaperXXXXXXXX
|-
|7||Evidence code||required||1||IMP or ISS (use 'IMP' for Experimental_model genes and ISS for Potential_model genes)
|-
|8||With (or) From||optional||0 or greater<br/>(separate values with pipes)||OMIM:600565<nowiki>|</nowiki>OMIM:600566
|-
|9||Aspect||required||1||D (D for disease ontology, all annotations have this value)
|-
|10||DB Object Name||optional||0 or 1||(no value, leave empty)
|-
|11||DB Object Synonym (<nowiki>|</nowiki>Synonym)||optional||0 or greater||(no value, leave empty)
|-
|12||DB Object Type||required||1||gene (all annotations have this value)
|-
|13||Taxon||required||1 or 2||taxon:6239
|-
|14||Date||required||1||20130422 (this is the date of annotation dumped under Date_last_updated; for potential_model genes, use the date on which the OMIM homology script is run)
|-
|15||Assigned By||required||1||WB
|-
|16||Annotation Extension||optional||0 or greater||(no value, leave empty)
|-
|17||Gene Product Form ID||optional||0 or greater||(no value, leave empty)
|}
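
To illustrate the 17-column layout, here is a hedged sketch of reading one line of such a file. It is not the WormBase dump script. The names `WB_DAF_COLUMNS`, `PIPE_SEPARATED` and `parse_wb_daf_line` are hypothetical, and only the columns that the table marks with a pipe are split on the pipe character. The sample line is assembled from the Example column for illustration only.

```python
# (column name, required?) in file order, transcribed from the table.
WB_DAF_COLUMNS = [
    ("DB", True),
    ("DB Object ID", True),
    ("DB Object Symbol", True),
    ("Qualifier", False),
    ("GO ID", True),
    ("DB:Reference", True),
    ("Evidence code", True),
    ("With (or) From", False),
    ("Aspect", True),
    ("DB Object Name", False),
    ("DB Object Synonym", False),
    ("DB Object Type", True),
    ("Taxon", True),
    ("Date", True),
    ("Assigned By", True),
    ("Annotation Extension", False),
    ("Gene Product Form ID", False),
]

# Assumption: only these columns are split on "|", because only they are
# described with pipes in the table.
PIPE_SEPARATED = {"DB:Reference", "With (or) From", "DB Object Synonym"}


def parse_wb_daf_line(line):
    """Split one tab-delimited line into a dict keyed by column name."""
    # Strip only the line terminator so that trailing empty columns survive.
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(WB_DAF_COLUMNS):
        raise ValueError(f"expected {len(WB_DAF_COLUMNS)} columns, found {len(fields)}")
    record = {}
    for position, ((name, required), value) in enumerate(zip(WB_DAF_COLUMNS, fields), start=1):
        if required and not value:
            raise ValueError(f"column {position} ({name}) is required but empty")
        if name in PIPE_SEPARATED:
            record[name] = value.split("|") if value else []
        else:
            record[name] = value
    return record


# Sample line assembled from the Example column of the table (illustrative only).
sample_line = "\t".join([
    "WB", "WBGene00007799", "nrx-1", "", "DOID:0060041", "WBPaper00041363",
    "IMP", "OMIM:600565|OMIM:600566", "D", "", "", "gene", "taxon:6239",
    "20130422", "WB", "", "",
]) + "\n"

record = parse_wb_daf_line(sample_line)
print(record["With (or) From"])  # prints ['OMIM:600565', 'OMIM:600566']
```

Optional columns that are left empty are still present as empty fields, so every line has exactly 17 tab-separated fields; the sketch rejects any line that does not.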


