Difference between revisions of "Model changes to capture and consolidate human disease data"

Revision as of 19:21, 29 November 2012

A new tag ‘Disease_info’ is proposed to consolidate disease-related data in WormBase:

?Gene
DB_info      Database                     ?Database ?Database_field Text
Disease_info Experimental_model_for_human ?DO_term XREF Gene_by_biology #Evidence	            
             Potential_model_for_human	?DO_term  XREF	Gene_by_orthology	#Evidence
             Human_disease_relevance	?Text	#Evidence //moved from ‘structured description’ tag.


Model for Disease Ontology Term:

?DO_term 
Name  UNIQUE               ?Text
Status UNIQUE              Valid
                           Obsolete
Alternate_id               ?Text
Definition UNIQUE          ?Text
Comment                    Text
Synonymn                   ?Text Scope_modifier UNIQUE Broad
                                                       Exact
                                                       Narrow
                                                       Related
Relationship      Is_a_child  	?DO_term  XREF  Is_a_parent
                  Is_a_parent 	?DO_term  XREF  Is_a_child 
DB_info           Database     ?Database  ?Database_field   Text              
Replaced_by                    ?DO_term
Subset                         Text                  
Created_by                     Text
Creation_date                  Text             
Attribute_of	Gene_by_biology    ?Gene       XREF   DO_term 
               Gene_by_orthology  ?Gene       XREF   DO_term
               Phenotype  ?Phenotype  XREF   DO_term
               WBProcess  ?WBProcess  XREF   DO_term
               Reference  ?Paper      XREF   DO_term 
Index	Ancestor   ?DO_term   XREF Descendent      
       Descendent ?DO_term   XREF Ancestor   
Version UNIQUE Text            


Paul D's suggestions/corrections for the 'Disease_info' tag under ?Gene:

  • Add a ?Species tag to indicate species information, instead of encoding the species in tag names such as 'Experimental_model_for_human'.

Suggestions/Corrections for the ?DO_term model:

  • Drop 'Created_by' and 'Creation_date'; this is not useful information to import, and the original source can be checked if it is needed.
  • For the 'subset' tag, follow standard procedure across models:
?DO_term
Type       GOLD
           gram-negative_bacterial_infectious_disease
           gram-positive_bacterial_infectious_disease
           sexually_transmitted_infectious_disease
           tick-borne_infectious_disease
           zoonotic_infectious_disease

Ideally this would have been UNIQUE but there are 178 DO_terms with multiple subset lines populated.

Or is there something better that maintains flexibility but controls the vocabulary better than Text?

Ranjana--If both ?GO_term and ?DO_term were to use 'Subset' and follow this structure, for the sake of uniformity across ontology models, would this be fine or should I switch the tag to 'Type'?
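The UNIQUE question above depends on how many terms carry more than one subset value. A minimal sketch of how that count could be checked, assuming the subset lines have already been extracted as (term, subset) pairs; the function name and the term IDs below are hypothetical placeholders, not real DO terms:

```python
from collections import defaultdict


def terms_with_multiple_subsets(pairs):
    """Return {term_id: sorted subset names} for terms with more than one subset."""
    subsets = defaultdict(set)
    for term_id, subset in pairs:
        subsets[term_id].add(subset)
    # Only these terms would conflict with a UNIQUE constraint on the tag.
    return {term: sorted(names) for term, names in subsets.items() if len(names) > 1}


# Hypothetical input; the IDs are placeholders for illustration only.
example = [
    ("DOID:A", "GOLD"),
    ("DOID:A", "zoonotic_infectious_disease"),
    ("DOID:B", "tick-borne_infectious_disease"),
]
print(terms_with_multiple_subsets(example))
```

Any term returned by this check would lose data if the tag were declared UNIQUE.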

  • This tag structure isn't workable, as there isn't a general DO_term tag in the ?Gene model and it's a 2:1 (two tags here XREF to a single tag there).
?DO_term
Attribute_of         Gene_by_biology    ?Gene       XREF   DO_term 
                     Gene_by_orthology  ?Gene       XREF   DO_term
  • This is also not a permitted tag combination: you aren't allowed to build a branch off a Text/?Text tag. This might be a bug in the acedb code, so I have raised a ticket with the acedb dev team, but as of now it's a show stopper. Could you re-think this section?
?DO_term
Synonymn ?Text Scope_modifier UNIQUE Broad
                                     Exact
                                     Narrow
                                     Related

Would this work?

?DO_term
 Synonymn UNIQUE Broad   ?Text
                 Exact   ?Text
                 Narrow  ?Text 
                 Related ?Text
OR 
?DO_term
Synonymn_broad ?Text
Synonymn_exact ?Text
Synonymn_narrow ?Text
Synonymn_related ?Text 
OR
?DO_term
Synonymn Broad_synonymn ?Text
         Exact_synonymn ?Text
         Narrow_synonymn ?Text
         Related_synonymn ?Text
           
  • Relationships--Could we also modify the Relationship section? The proposed tag names are new, while relationships are already used in multiple other classes. Could you copy one of the other models? I just created some test data and had the relationship reversed, because of the tag names and because I wasn't being careful. We should try to re-use common tag structures; that way, if there are enough of them, we can move them into a Hash to simplify the models file.
?Anatomy_term
Lineage      Parent_term  UNIQUE  ?Anatomy_term XREF Daughter_term
             Daughter_term        ?Anatomy_term XREF Parent_term
?SO_term
        Parent Is_a ?SO_term XREF Is
               Part_of ?SO_term XREF Part
               Derived_from ?SO_term XREF Derives
               Member_of ?SO_term XREF Member
        Child  Is ?SO_term XREF Is_a
               Part ?SO_term XREF Part_of
               Derives ?SO_term XREF Derived_from
               Member ?SO_term XREF Member_of
?Cell
       Lineage Parent  UNIQUE  ?Cell XREF Daughter
               Daughter        ?Cell XREF Parent
?GO_term
       Child     Instance ?GO_term XREF Instance_of
                 Component ?GO_term XREF Component_of
       Parent    Instance_of ?GO_term XREF Instance
                 Component_of ?GO_term XREF Component
  • Index and Relationship--Are these not storing the same data?

Ranjana--I think Relationship is for the immediate parent/child information, whereas under Index we would list every ancestor and descendant, so as to be able to get the complete ancestry.
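The distinction between the immediate links and the Index can be sketched as follows. This is only an illustration, assuming the immediate Is_a parents are available as a plain dictionary; the function names and term names are hypothetical, and this is not how acedb itself populates the tags:

```python
def ancestors_of(term, parents):
    """Collect every ancestor of `term` by following immediate Is_a parents."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def build_index(parents):
    """Derive Ancestor and Descendent sets from immediate parent links."""
    terms = set(parents)
    for immediate in parents.values():
        terms.update(immediate)
    ancestor = {term: ancestors_of(term, parents) for term in terms}
    # Descendent is the reverse of Ancestor, mirroring the XREF pair in the model.
    descendant = {term: set() for term in terms}
    for term, found in ancestor.items():
        for anc in found:
            descendant[anc].add(term)
    return ancestor, descendant


# Placeholder terms: term_c Is_a term_b, term_b Is_a term_a.
parents = {"term_c": {"term_b"}, "term_b": {"term_a"}}
ancestor, descendant = build_index(parents)
print(sorted(ancestor["term_c"]))    # immediate parent plus grandparent
print(sorted(descendant["term_a"]))  # child plus grandchild
```

Here the Parent/Child tags would hold only the term_c to term_b and term_b to term_a links, while Index would additionally record term_a as an Ancestor of term_c.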

So as of 11/29/2012 we have:

?Gene
DB_info  Database ?Database ?Database_field Text // for pointing to OMIM ortholog and disease
Disease_info 	Experimental_model ?DO_term XREF Gene_by_biology   ?Species   #Evidence	            
             	Potential_model	   ?DO_term XREF Gene_by_orthology ?Species #Evidence
             	Disease_relevance  ?Text ?Species #Evidence

and

?DO_term 
Name  UNIQUE               ?Text
Status UNIQUE              Valid
                           Obsolete
Alternate_id               ?Text
Definition UNIQUE          ?Text
Comment                    Text
Synonymn   UNIQUE          Broad   ?Text
                           Exact   ?Text
                           Narrow  ?Text
                           Related ?Text
Parent      	     Is_a  	?DO_term  XREF  Is
Child               Is 	?DO_term  XREF  Is_a 
DB_info             Database     ?Database  ?Database_field   Text              
Type                GOLD
                    gram-negative_bacterial_infectious_disease
                    gram-positive_bacterial_infectious_disease
                    sexually_transmitted_infectious_disease
                    tick-borne_infectious_disease
                    zoonotic_infectious_disease
Attribute_of        Gene_by_biology    ?Gene       XREF   Experimental_model
                    Gene_by_orthology  ?Gene       XREF   Potential_model
                    Phenotype          ?Phenotype  XREF   DO_term
                    WBProcess          ?WBProcess  XREF   DO_term
                    Reference          ?Paper      XREF   DO_term
Index	             Ancestor   ?DO_term   XREF Descendent      
                    Descendent ?DO_term   XREF Ancestor   
Version             UNIQUE Text
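
The earlier objection, that ?DO_term pointed at a DO_term tag which ?Gene does not have, can be checked mechanically. The sketch below is a simplified illustration and not an acedb tool: it assumes each model has been reduced by hand to a dictionary of tag name to (target class, XREF tag), and the function name is hypothetical. It only tests that every XREF names a tag present in the target class.

```python
def missing_xref_targets(models):
    """Return sorted (class, tag) pairs whose XREF tag is absent from the target class."""
    problems = []
    for cls, tags in models.items():
        for tag, (target, xref_tag) in tags.items():
            if xref_tag not in models.get(target, {}):
                problems.append((cls, tag))
    return sorted(problems)


# Gene <-> DO_term links as first proposed (hand-transcribed subset of the models).
original = {
    "Gene": {
        "Experimental_model_for_human": ("DO_term", "Gene_by_biology"),
        "Potential_model_for_human": ("DO_term", "Gene_by_orthology"),
    },
    "DO_term": {
        "Gene_by_biology": ("Gene", "DO_term"),
        "Gene_by_orthology": ("Gene", "DO_term"),
    },
}

# The same links in the 11/29/2012 version.
revised = {
    "Gene": {
        "Experimental_model": ("DO_term", "Gene_by_biology"),
        "Potential_model": ("DO_term", "Gene_by_orthology"),
    },
    "DO_term": {
        "Gene_by_biology": ("Gene", "Experimental_model"),
        "Gene_by_orthology": ("Gene", "Potential_model"),
    },
}

print(missing_xref_targets(original))  # both DO_term tags point at a missing ?Gene tag
print(missing_xref_targets(revised))   # no problems in the revised pairing
```

The Phenotype, WBProcess and Reference lines still XREF to a DO_term tag, so the same check would apply to the ?Phenotype, ?WBProcess and ?Paper models, which are not shown here.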