Difference between revisions of "WS236 Models.wrm"

From WormBaseWiki
(→‎* ?Gene and Disease Ontology - Ranjana: modified tag names to remove - characters.)
 
(19 intermediate revisions by the same user not shown)
Line 27: Line 27:
 
I'd like to change tags in the ?Gene model and #Gene_history_action hash.
 
I'd like to change tags in the ?Gene model and #Gene_history_action hash.
  
<pre>
+
<pre style="white-space: pre-wrap;
 +
white-space: -moz-pre-wrap;
 +
white-space: -pre-wrap;
 +
white-space: -o-pre-wrap;
 +
word-wrap: break-word">
 
#Gene_history_action Event  Created
 
#Gene_history_action Event  Created
 
                             Killed
 
                             Killed
Line 76: Line 80:
 
</pre>
 
</pre>
  
=== * Sequence collection - Kevin ===
+
=== * Species/Sequence collection - Kevin ===
  
 
When we added the Assembly Sequence_collection to the ?Species class, we made it unique, in the assumption that there would only ever be one current assembly for each species.
 
When we added the Assembly Sequence_collection to the ?Species class, we made it unique, in the assumption that there would only ever be one current assembly for each species.
  
Turns out we were wrong. So I therefore propose to remove the UNIQUE.  
+
Turns out we were wrong. So I therefore propose to remove the UNIQUE.
  
=== * Starin modifications ===
+
=== * Strain modifications ===
  
 
==== ?Strain - Michael====
 
==== ?Strain - Michael====
 
I would like to propose removal of the unused tags:
 
I would like to propose removal of the unused tags:
<pre>
+
<pre style="white-space: pre-wrap;
 +
white-space: -moz-pre-wrap;
 +
white-space: -pre-wrap;
 +
white-space: -o-pre-wrap;
 +
word-wrap: break-word">
 
Reference_strain - changes to the Sequence_collection class we would be able to identify a reference strain.
 
Reference_strain - changes to the Sequence_collection class we would be able to identify a reference strain.
 
Males - has been superseded by the phenotype ontology
 
Males - has been superseded by the phenotype ontology
Line 107: Line 115:
 
?Strain
 
?Strain
 
     Isolation GPS UNIQUE Float UNIQUE Float #Position_confidence
 
     Isolation GPS UNIQUE Float UNIQUE Float #Position_confidence
          Elevation UNIQUE Unique Float #Position_confidence
+
              Elevation UNIQUE Unique Float #Position_confidence
  
 
#Position_confidence Exact
 
#Position_confidence Exact
Line 123: Line 131:
 
=== * ?Variation and ?Feature mapping changes ===
 
=== * ?Variation and ?Feature mapping changes ===
  
 +
==== Mapping ====
 
This models update unifies the mapping of these 2 classes and provides a second (better) method for mapping genomic features which we will be using in the near future.
 
This models update unifies the mapping of these 2 classes and provides a second (better) method for mapping genomic features which we will be using in the near future.
  
<pre>
+
It has been pointed out that 1bp features cannot infer strand solely based on the 2 stored co-ordinates so storing the source strand where available might be useful. As the majority of this data will be coming from gff, we will adopt the gff strand convention (+-.)
 +
 
 +
<pre style="white-space: pre-wrap;
 +
white-space: -moz-pre-wrap;
 +
white-space: -pre-wrap;
 +
white-space: -o-pre-wrap;
 +
word-wrap: break-word">
 
?Variation
 
?Variation
 
SMap S_parent UNIQUE Sequence UNIQUE ?Sequence XREF Allele    //Data removed from the primary database
 
SMap S_parent UNIQUE Sequence UNIQUE ?Sequence XREF Allele    //Data removed from the primary database
 
     Sequence_details Flanking_sequences UNIQUE Text UNIQUE Text  //No change
 
     Sequence_details Flanking_sequences UNIQUE Text UNIQUE Text  //No change
 
Mapping_target UNIQUE ?Sequence //addition to store the Sequence the object should map to.
 
Mapping_target UNIQUE ?Sequence //addition to store the Sequence the object should map to.
Source_location UNIQUE Int UNIQUE ?Sequence UNIQUE Int UNIQUE Int //data provided by paper/submitter/project
+
Source_location UNIQUE Int UNIQUE ?Sequence UNIQUE Int UNIQUE Int UNIQUE Text #Evidence //data provided by paper/submitter/project
  
 
?Feature
 
?Feature
Line 136: Line 151:
 
     Sequence_details Flanking_sequences UNIQUE Text UNIQUE Text  //Removal of leading ?Sequence connection
 
     Sequence_details Flanking_sequences UNIQUE Text UNIQUE Text  //Removal of leading ?Sequence connection
 
Mapping_target UNIQUE ?Sequence //addition to store the Sequence the object should map to.
 
Mapping_target UNIQUE ?Sequence //addition to store the Sequence the object should map to.
Source_location UNIQUE Int UNIQUE ?Sequence UNIQUE Int UNIQUE Int //data provided by paper/submitter/project
+
Source_location UNIQUE Int UNIQUE ?Sequence UNIQUE Int UNIQUE Int UNIQUE Text #Evidence //data provided by paper/submitter/project
 
</pre>
 
</pre>
  
 +
==== Gary W. ====
 +
Would also like to add a Text description of the type of score followed by an #Evidence hash to the ?Feature model to follow Score
 +
 +
<pre>
 +
?Feature Score Float Text #Evidence
 +
</pre>
  
 
=== * ?Gene and Disease Ontology - Ranjana ===
 
=== * ?Gene and Disease Ontology - Ranjana ===
  
 
Proposed and working through Ideas with Hinxton.
 
Proposed and working through Ideas with Hinxton.
 +
 +
[[Model_changes_to_capture_and_consolidate_human_disease_data | Background]]
  
 
?Gene additions
 
?Gene additions
<pre>
+
<pre style="white-space: pre-wrap;
 +
white-space: -moz-pre-wrap;
 +
white-space: -pre-wrap;
 +
white-space: -o-pre-wrap;
 +
word-wrap: break-word">
 
?Gene
 
?Gene
DB_info Database ?Database ?Database_field Text
+
DB_info Database ?Database ?Database_field Text//for pointing to OMIM ortholog and disease
Disease_info Human_experimental_model ?DO_term XREF Gene_by_biology #Evidence              
+
Disease_info Experimental_model ?DO_term XREF Gene_by_biology   ?Species  #Evidence              
              Human_potential_model ?DO_term XREF Gene_by_orthology #Evidence
+
            Potential_model   ?DO_term XREF Gene_by_orthology ?Species #Evidence
              Human_disease_relevance ?Text #Evidence
+
            Disease_relevance  ?Text ?Species #Evidence
 
</pre>
 
</pre>
  
 
?DO_term - New class
 
?DO_term - New class
<pre>
+
<pre style="white-space: pre-wrap;
 +
white-space: -moz-pre-wrap;
 +
white-space: -pre-wrap;
 +
white-space: -o-pre-wrap;
 +
word-wrap: break-word">
 
?DO_term  
 
?DO_term  
Name  UNIQUE             ?Text
+
Name  UNIQUE               ?Text
Status UNIQUE              Valid
+
Status UNIQUE              Valid
                              Obsolete
+
                          Obsolete
Alternate_id                     ?Text
+
Alternate_id               ?Text
Definition UNIQUE         ?Text
+
Definition UNIQUE         ?Text
Comment           Text
+
Comment                   Text
Synonymn          ?Text Scope_modifier UNIQUE Broad
+
Synonymn          Broad  ?Text
                                                Exact
+
                  Exact   ?Text
                                                Narrow
+
                  Narrow ?Text
                                                Related
+
                  Related ?Text
Relationship        Is_a_child ?DO_term  XREF  Is_a_parent
+
Parent            Is_a ?DO_term  XREF  Is
                  Is_a_parent ?DO_term  XREF  Is_a_child
+
Child              Is         ?DO_term  XREF  Is_a
DB_info           Database   ?Database  ?Database_field  Text               
+
DB_info           Database     ?Database  ?Database_field  Text               
Replaced_by                   ?DO_term
+
Type              GOLD                    
  Subset                              Text                 
+
                  gram_negative_bacterial_infectious_disease //changed - for _ in tag names.
Created_by                      Text
+
                  gram_positive_bacterial_infectious_disease
Creation_date                  Text           
+
                  sexually_transmitted_infectious_disease
Attribute_of Gene_by_biology    ?Gene      XREF  DO_term
+
                  tick_borne_infectious_disease
                    Gene_by_orthology  ?Gene      XREF  DO_term
+
                  zoonotic_infectious_disease
                    Phenotype  ?Phenotype  XREF  DO_term
+
Attribute_of       Gene_by_biology    ?Gene      XREF  Experimental_model
                    WBProcess  ?WBProcess  XREF  DO_term
+
                  Gene_by_orthology  ?Gene      XREF  Potential_model
                    Reference  ?Paper      XREF  DO_term  
+
                  Phenotype  ?Phenotype  XREF  DO_term
Index Ancestor  ?DO_term  XREF Descendent     
+
                  WBProcess  ?WBProcess  XREF  DO_term
        Descendent ?DO_term  XREF Ancestor 
+
                  Reference  ?Paper      XREF  DO_term  
Version UNIQUE Text    
+
Version           UNIQUE Text
 
</pre>
 
</pre>
  
 +
Other class modifications because of XREF/References
  
Possible changes
+
<pre>
 +
?Phenotype
 +
DO_term ?DO_term #Evidence
 +
 +
?WBProcess
 +
DO_term ?DO_term #Evidence
  
1) further eradication of the species from the tags in favor of a ?Species tag in the model branch
+
?Paper
 +
DO_term ?DO_term #Evidence
 +
</pre>
  
2) Couple of issues with broken tag structures/Models
+
Possible changes
  
3) Data storage questions
+
Most previously mentioned have been resolved
  Created_by Text
 
  Version Text
 
  Subset Text
 
  Creation_date Text -> Creation_date DateType
 
  
4) Possible redundancy between Index and Parent/Child
+
1) Possible redundancy between Index and Parent/Child
  
5) Standardisation of Parent/Child as there are multiple ontologies in the schema, all doing things differently.
+
Postponed tags:
 +
<pre>
 +
Index              Ancestor  ?DO_term  XREF Descendent     
 +
                  Descendent ?DO_term  XREF Ancestor
 +
</pre>
  
 
=== * Condition Class - Wen ===
 
=== * Condition Class - Wen ===
Line 208: Line 247:
  
 
Temperature Float
 
Temperature Float
 +
</pre>
 +
  
</pre>
+
&copy; WormBase 2012

Latest revision as of 15:31, 14 December 2012

* General housekeeping - ?Accession class

?Accession class retirement phase 1

  • Remove simple examples of:
Database ?Database ?Database_field ?Accession 

from the models file in favor of:

?Database_field Text

This will significantly reduce the number of empty objects in the ?Accession class.
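
The rewrite described above is mechanical enough to sketch. The Python below is only an illustration: it assumes model lines are handled as plain text and that a "simple" case is any line containing `?Database ?Database_field ?Accession`. The helper name `retire_accession` is hypothetical and not part of any WormBase tooling.

```python
import re

# Assumption: a "simple" case is "?Database ?Database_field ?Accession" on one line.
_SIMPLE = re.compile(r"(\?Database\s+\?Database_field\s+)\?Accession\b")


def retire_accession(line: str) -> str:
    """Replace the ?Accession pointer with plain Text; other lines pass through unchanged."""
    return _SIMPLE.sub(r"\1Text", line)


if __name__ == "__main__":
    before = "Database ?Database ?Database_field ?Accession"
    print(retire_accession(before))
```

Lines that do not match the pattern are returned untouched, so anything more complex than the simple case would still need a manual review.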

* Transposon gene reactivation - Paul

Poor nomenclature in ?Gene history and Event

Model changes to ?Gene and ?Transposon

I'd like to change tags in the ?Gene model and #Gene_history_action hash.

#Gene_history_action Event  Created
                            Killed
                            Made_into_transposon        // for CDSs that become Transposon CDSs - no longer count as Live Gene

#Gene_history_action Event  Created
                            Killed
                            Transposon_in_origin        // for genes that are identified as having CDSs/Proteins that are Transposon in origin

?Gene
History Version_change Int UNIQUE DateType UNIQUE ?Person #Gene_history_action
    Made_into_transposon

?Gene
History Version_change Int UNIQUE DateType UNIQUE ?Person #Gene_history_action
    Transposon_in_origin

This change is needed because the Made_into_transposon tag was poorly thought out and is misleading (it gets a high-status position in the Overview widget, which is good, but the tag name has issues).

Gene::Transposon

I'd also like to make a connection between Gene and Transposon

?Gene
        Allele ?Variation XREF Gene #Evidence
    Corresponding_transposon ?Transposon XREF Gene #Evidence

?Transposon
    Corresponding_gene ?Gene XREF Corresponding_transposon #Evidence 
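
As a rough illustration of what a reciprocal XREF connection gives us, the toy Python below mirrors every link onto the target object. It is not ACeDB code: `TinyStore` and the object names are hypothetical, and the sketch assumes the reciprocal tag on the ?Transposon side is Corresponding_gene, as in the ?Transposon fragment above.

```python
from collections import defaultdict


class TinyStore:
    """Toy object store, just enough to show reciprocal (XREF-style) links."""

    def __init__(self):
        # (class, object name, tag) -> set of linked object names
        self.links = defaultdict(set)

    def link(self, cls, name, tag, target_cls, target, xref_tag):
        """Set `tag` on one object and mirror the link under `xref_tag` on the target."""
        self.links[(cls, name, tag)].add(target)
        self.links[(target_cls, target, xref_tag)].add(name)


if __name__ == "__main__":
    store = TinyStore()
    # Hypothetical identifiers, used only for illustration.
    store.link("Gene", "gene_example", "Corresponding_transposon",
               "Transposon", "transposon_example", "Corresponding_gene")
    print(store.links[("Transposon", "transposon_example", "Corresponding_gene")])
```

The point of the connection is that data only has to be entered on one side; the other side is populated automatically.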

#Gene_history_action

I just spotted an additional change to the #Gene_history_action model that might be needed.

2 options.

1) Just remove Killed from the "Suppressed" genes. A use case would be the Transposon_in_origin genes, where the scripts would only use the Transposon_in_origin tag in the History tag structure.

2) Talking with Mary Ann, even though this is overkill, it fills an omission whereby all tags in the #Gene_history_action can be represented as a rooted tag within the object and for consistency this should be added.

addition of:

#Gene_history_action Event Suppressed 

* Species/Sequence collection - Kevin

When we added the Assembly Sequence_collection to the ?Species class, we made it unique, on the assumption that there would only ever be one current assembly for each species.

It turns out we were wrong, so I propose to remove the UNIQUE.

* Strain modifications

?Strain - Michael

I would like to propose removal of the unused tags:

Reference_strain - with changes to the Sequence_collection class we would be able to identify a reference strain.
Males - has been superseded by the phenotype ontology

?Strain - Mary Ann

Sample_history

renamed to

Strain_history

M-A Felix wants this renamed and as of now it has only been used once and is not displayed on the site.

?Strain - Mary Ann

?Strain
     Isolation GPS UNIQUE Float UNIQUE Float #Position_confidence
               Elevation UNIQUE Float #Position_confidence

#Position_confidence Exact
                     Approximate
                     Inferred_from_GPS


Michael suggested a UNIQUE Float after the Approximate tag to specify the estimated error range (+-360.0), but this might be overkill.

* ?Drug_resistance

Mary Ann proposed a model but upon internal discussion it seemed that this might be better placed with Karen at Caltech.

* ?Variation and ?Feature mapping changes

Mapping

This model update unifies the mapping of these two classes and provides a second (better) method for mapping genomic features, which we will be using in the near future.

It has been pointed out that the strand of a 1 bp feature cannot be inferred solely from the two stored coordinates, so storing the source strand where available might be useful. As the majority of this data will come from GFF, we will adopt the GFF strand convention (+, -, .).
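
To make the 1 bp problem concrete, here is a minimal Python sketch. It assumes strand is normally read from coordinate order (start below end is forward, start above end is reverse); the function names are hypothetical and this is not the mapping code itself.

```python
from typing import Optional

GFF_STRANDS = {"+", "-", "."}  # the three symbols adopted here; "." means unstranded


def infer_strand(start: int, end: int) -> Optional[str]:
    """Infer strand from coordinate order alone; None when it cannot be decided."""
    if start < end:
        return "+"
    if start > end:
        return "-"
    return None  # 1 bp feature: start == end, so the order carries no strand information


def resolve_strand(start: int, end: int, source_strand: Optional[str] = None) -> Optional[str]:
    """Prefer an explicitly stored source strand, else fall back to the coordinates."""
    if source_strand is not None:
        if source_strand not in GFF_STRANDS:
            raise ValueError(f"not a GFF strand symbol: {source_strand!r}")
        return source_strand
    return infer_strand(start, end)


if __name__ == "__main__":
    print(infer_strand(100, 100))         # undecidable from coordinates alone
    print(resolve_strand(100, 100, "-"))  # decided by the stored source strand
```

With a stored strand symbol the 1 bp case resolves cleanly; without it, the coordinates alone leave the strand undefined.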

?Variation
SMap 	S_parent UNIQUE Sequence UNIQUE ?Sequence XREF Allele    //Data removed from the primary database
    	Sequence_details	Flanking_sequences UNIQUE Text UNIQUE Text  //No change
				Mapping_target UNIQUE ?Sequence //addition to store the Sequence the object should map to.
				Source_location UNIQUE Int UNIQUE ?Sequence UNIQUE Int UNIQUE Int UNIQUE Text #Evidence //data provided by paper/submitter/project

?Feature
SMap 	S_parent UNIQUE Sequence UNIQUE ?Sequence XREF Allele    //Data removed from the primary database
    	Sequence_details	Flanking_sequences UNIQUE Text UNIQUE Text  //Removal of leading ?Sequence connection
				Mapping_target UNIQUE ?Sequence //addition to store the Sequence the object should map to.
				Source_location UNIQUE Int UNIQUE ?Sequence UNIQUE Int UNIQUE Int UNIQUE Text #Evidence //data provided by paper/submitter/project

Gary W.

Would also like to add a Text description of the type of score followed by an #Evidence hash to the ?Feature model to follow Score

?Feature Score Float Text #Evidence

* ?Gene and Disease Ontology - Ranjana

Proposed; working through ideas with Hinxton.

Background

?Gene additions

?Gene
DB_info  Database ?Database ?Database_field Text //for pointing to OMIM ortholog and disease
Disease_info 	Experimental_model ?DO_term XREF Gene_by_biology   ?Species   #Evidence	            
             	Potential_model	   ?DO_term XREF Gene_by_orthology ?Species #Evidence
             	Disease_relevance  ?Text ?Species #Evidence

?DO_term - New class

?DO_term 
Name  UNIQUE               ?Text
Status UNIQUE              Valid
                           Obsolete
Alternate_id               ?Text
Definition UNIQUE          ?Text
Comment                    Text
Synonymn           Broad   ?Text
                   Exact   ?Text
                   Narrow  ?Text
                   Related ?Text
Parent             Is_a  	?DO_term  XREF  Is
Child              Is 	        ?DO_term  XREF  Is_a 
DB_info            Database     ?Database  ?Database_field   Text              
Type               GOLD                   
                   gram_negative_bacterial_infectious_disease  //changed - for _ in tag names.
                   gram_positive_bacterial_infectious_disease
                   sexually_transmitted_infectious_disease
                   tick_borne_infectious_disease
                   zoonotic_infectious_disease
Attribute_of       Gene_by_biology    ?Gene       XREF   Experimental_model
                   Gene_by_orthology  ?Gene       XREF   Potential_model
                   Phenotype  ?Phenotype  XREF   DO_term
                   WBProcess  ?WBProcess  XREF   DO_term
                   Reference  ?Paper      XREF   DO_term 
Version            UNIQUE Text

Other class modifications because of XREF/References

?Phenotype
DO_term ?DO_term #Evidence
 
?WBProcess
DO_term ?DO_term #Evidence

?Paper 
DO_term ?DO_term #Evidence

Possible changes

Most previously mentioned have been resolved

1) Possible redundancy between Index and Parent/Child

Postponed tags:

Index              Ancestor   ?DO_term   XREF Descendent      
                   Descendent ?DO_term   XREF Ancestor 
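
The suspected redundancy can be sketched as follows: if every term stores its direct Is_a parents, the full ancestor set is just the transitive closure of those links and could be computed rather than stored. The Python below is only a sketch under that assumption; the term names are made up and are not real DO identifiers.

```python
from typing import Dict, Set


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Collect every term reachable by following direct parent links upward."""
    seen: Set[str] = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


if __name__ == "__main__":
    # Hypothetical three-term chain: term_C is_a term_B is_a term_A.
    is_a = {"term_C": {"term_B"}, "term_B": {"term_A"}, "term_A": set()}
    print(sorted(ancestors("term_C", is_a)))
```

The trade-off is that a stored Index makes ancestor and descendant queries a single lookup, while deriving them costs a traversal per query.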

* Condition Class - Wen

Change the Temperature Int to a Float

Temperature Int

Temperature Float


© WormBase 2012