Model changes to capture and consolidate human disease data

From WormBaseWiki
Revision as of 22:04, 28 November 2012 by Rkishore (talk | contribs)

A new tag, 'Disease_info', is proposed for the ?Gene model to consolidate disease-related data in WormBase:

DB_info      Database                     ?Database ?Database_field Text
Disease_info Experimental_model_for_human ?DO_term XREF Gene_by_biology   #Evidence
             Potential_model_for_human    ?DO_term XREF Gene_by_orthology #Evidence
             Human_disease_relevance      ?Text #Evidence  // moved from the 'structured description' tag

Model for the Disease Ontology term (?DO_term):

Name               ?Text
Status UNIQUE      Valid
Alternate_id       ?Text
Definition         ?Text
Comment            Text
Synonymn           ?Text Scope_modifier UNIQUE Broad
Relationship       is_a ?DO_term
DB_info            Database ?Database ?Database_field Text  // to reference OMIM
Replaced_by        ?DO_term
Subset             Text
Created_by         Text
Creation_date      Text
Attribute_of       Gene_by_biology    ?Gene      XREF DO_term
                   Gene_by_orthology  ?Gene      XREF DO_term
                   Phenotype          ?Phenotype XREF DO_term
                   WBProcess          ?WBProcess XREF DO_term
                   Reference          ?Paper     XREF DO_term
Index              Ancestor           ?DO_term   // Consider transitivity; needed for web display?
                   Descendent         ?DO_term   // Consider transitivity; needed for web display?
Version UNIQUE     Text               // revision number

Paul D's suggestions/corrections:

For the 'Disease_info' tag under ?Gene:

  • Add a ?Species tag to indicate species information, instead of encoding the species in tag names such as 'Experimental_model_for_human'.

For ?DO_term model:

  • Drop 'Created_by' and 'Creation_date'; this is not useful information to import, and the original source can be checked if it is needed.
  • For the 'subset' tag, follow standard procedure across models:
Type       GOLD

Ideally this would have been UNIQUE, but there are 178 DO_terms with multiple subset lines populated.

Or is there something better that maintains flexibility but controls the vocabulary better than Text?
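
The count of terms with more than one subset line could be re-checked against a given release with a sketch like the one below. It assumes the Disease Ontology is available as an OBO flat file, in which each `[Term]` stanza has an `id:` line and zero or more `subset:` lines. The function names and the sample stanzas (including the `OTHER_SLIM` subset) are hypothetical and only illustrate the check.

```python
def subsets_per_term(obo_text):
    """Map each [Term] id to the list of subset values found in its stanza."""
    subsets = {}
    current_id = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = (line == "[Term]")
            current_id = None
        elif in_term and line.startswith("id:"):
            current_id = line[len("id:"):].strip()
            subsets.setdefault(current_id, [])
        elif in_term and current_id and line.startswith("subset:"):
            # Drop any trailing "! comment" that OBO allows after a value.
            value = line[len("subset:"):].split("!")[0].strip()
            subsets[current_id].append(value)
    return subsets


def multi_subset_terms(obo_text):
    """Return the sorted ids of terms that carry more than one subset line."""
    return sorted(t for t, s in subsets_per_term(obo_text).items() if len(s) > 1)


# Hypothetical sample data, not taken from a real DO release.
SAMPLE = """[Term]
id: DOID:1
subset: GOLD
subset: OTHER_SLIM

[Term]
id: DOID:2
subset: GOLD

[Typedef]
id: part_of
"""

if __name__ == "__main__":
    print(multi_subset_terms(SAMPLE))
```

If the list of distinct subset values turns out to be short, that would argue for a controlled set of tags rather than free Text; if it is long or changes between releases, Text may be the more flexible choice.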

  • This tag structure isn't workable: there is no general DO_term tag in the ?Gene model for the XREF to point to, and it is a 2:1 mapping (two tags here pointing at one tag in ?Gene):
Attribute_of         Gene_by_biology    ?Gene       XREF   DO_term 
                     Gene_by_orthology  ?Gene       XREF   DO_term
  • This is also not a permitted tag combination: you aren't allowed to build a branch off a Text/?Text tag. This might be a bug in the acedb code, so I have raised a ticket with the acedb dev team, but as of now it's a showstopper. Could you rethink this section?
Synonymn ?Text Scope_modifier UNIQUE Broad
  • Relationships: Could we also modify the Relationship section? The proposed tag names are new, and relationships are already used in multiple other classes. Could you copy one of the other models? I just created some test data and had the relationship reversed, because of the tag names and because I wasn't being careful. We should try to re-use common tag structures; that way, if there are enough of them, we can move them into a Hash to simplify the models file. Examples from existing models:
Lineage      Parent_term  UNIQUE  ?Anatomy_term XREF Daughter_term
             Daughter_term        ?Anatomy_term XREF Parent_term

Parent       Is_a         ?SO_term XREF Is
             Part_of      ?SO_term XREF Part
             Derived_from ?SO_term XREF Derives
             Member_of    ?SO_term XREF Member
Child        Is           ?SO_term XREF Is_a
             Part         ?SO_term XREF Part_of
             Derives      ?SO_term XREF Derived_from
             Member       ?SO_term XREF Member_of

Lineage      Parent  UNIQUE  ?Cell XREF Daughter
             Daughter        ?Cell XREF Parent

Child        Instance     ?GO_term XREF Instance_of
             Component    ?GO_term XREF Component_of
Parent       Instance_of  ?GO_term XREF Instance
             Component_of ?GO_term XREF Component
  • Index and Relationship: are these not storing the same data?
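
One possible reading, going by the "Consider transitivity" comments in the proposed model, is that Relationship would hold only the direct is_a parents while Index would hold the full transitive closure for web display. Under that assumption the two are not identical, but Index can be derived from Relationship, as the sketch below illustrates. The function names and term ids are hypothetical, and the input is a plain mapping from each term to its direct is_a parents.

```python
def ancestors(term, parents):
    """Return every term reachable from `term` by following is_a links upward."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def build_index(parents):
    """Derive Ancestor and Descendent sets for every term from direct parents."""
    terms = set(parents)
    for direct in parents.values():
        terms.update(direct)
    ancestor_index = {t: ancestors(t, parents) for t in terms}
    # Descendent is the inverse of Ancestor.
    descendant_index = {t: set() for t in terms}
    for t, ancs in ancestor_index.items():
        for a in ancs:
            descendant_index[a].add(t)
    return ancestor_index, descendant_index


# Hypothetical three-level chain: DOID:3 is_a DOID:2 is_a DOID:1.
DIRECT_PARENTS = {"DOID:3": ["DOID:2"], "DOID:2": ["DOID:1"]}

if __name__ == "__main__":
    anc, desc = build_index(DIRECT_PARENTS)
    print(sorted(anc["DOID:3"]))
    print(sorted(desc["DOID:1"]))
```

If Index is meant to hold only direct parents and children, it would duplicate Relationship and one of the two could be dropped.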