Difference between revisions of "WS236 Models.wrm"

(→‎* ?Gene and Disease Ontology - Ranjana: modified tag names to remove - characters.)
 
(11 intermediate revisions by the same user not shown)

Latest revision as of 15:31, 14 December 2012

* General housekeeping - ?Accession class

?Accession class retirement phase 1

  • Remove simple examples of:
Database ?Database ?Database_field ?Accession 

from the models file in favor of:

?Database_field Text

This will significantly reduce the number of empty objects in the ?Accession class.
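A minimal sketch of how the phase 1 substitution might be applied, assuming the models file is handled as plain text. The function name `retire_accession` and the regular expression are illustrative assumptions, not part of any WormBase script; lines that carry an XREF after ?Accession are deliberately left alone because they are not "simple examples".

```python
import re

# Hypothetical pattern: the simple "Database ?Database ?Database_field ?Accession"
# case, with any whitespace between tokens and no XREF following ?Accession.
_ACCESSION_RE = re.compile(
    r"(Database\s+\?Database\s+\?Database_field\s+)\?Accession\b(?!\s+XREF)"
)


def retire_accession(model_text: str) -> str:
    """Replace the trailing ?Accession of simple Database lines with Text."""
    return _ACCESSION_RE.sub(r"\1Text", model_text)


if __name__ == "__main__":
    print(retire_accession("Database ?Database ?Database_field ?Accession"))
```

Anything more complex than the simple case would still need to be reviewed by hand.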

* Transposon gene reactivation - Paul

Poor nomenclature in ?Gene history and Event

Model changes to ?Gene and ?Transposon

I'd like to change tags in the ?Gene model and #Gene_history_action hash.

#Gene_history_action Event  Created
                            Killed
                            Made_into_transposon        // for CDSs that become Transposon CDSs - no longer count as Live Gene

#Gene_history_action Event  Created
                            Killed
                            Transposon_in_origin        // for genes that are identified as having CDSs/Proteins that are Transposon in origin

?Gene
History Version_change Int UNIQUE DateType UNIQUE ?Person #Gene_history_action
    Made_into_transposon

?Gene
History Version_change Int UNIQUE DateType UNIQUE ?Person #Gene_history_action
    Transposon_in_origin

The Made_into_transposon tag was poorly thought out and is misleading: it gets a high-status position in the Overview widget, which is good, but the tag name has issues. Hence the proposed rename to Transposon_in_origin.
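As a rough illustration of the rename from Made_into_transposon to Transposon_in_origin, the sketch below rewrites the tag in a text dump of Gene data. It assumes the data is available as plain text with one tag per line; the helper name `rename_tag` and the sample record (gene ID, date, person) are hypothetical.

```python
import re

OLD_TAG = "Made_into_transposon"
NEW_TAG = "Transposon_in_origin"

# Match the old tag only as a whole token, so longer names that merely
# contain it are not touched.
_TAG_RE = re.compile(rf"(?<![\w-]){re.escape(OLD_TAG)}(?![\w-])")


def rename_tag(ace_text: str) -> str:
    """Rename every whole-token occurrence of OLD_TAG to NEW_TAG."""
    return _TAG_RE.sub(NEW_TAG, ace_text)


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    record = (
        'Gene : "WBGene00000001"\n'
        'Version_change 2 "2012-12-14" "WBPerson1" Event Made_into_transposon\n'
        "Made_into_transposon\n"
    )
    print(rename_tag(record))
```

Both the rooted tag in ?Gene and the Event entry in #Gene_history_action are covered by the same substitution.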

Gene::Transposon

I'd also like to make a connection between Gene and Transposon

?Gene
        Allele ?Variation XREF Gene #Evidence
    Corresponding_transposon ?Transposon XREF Gene #Evidence

?Transposon
    Corresponding_gene ?Gene XREF Corresponding_transposon #Evidence 

#Gene_history_action

I just spotted an additional change to the #Gene_history_action model that might be needed.

2 options.

1) Just remove the Killed event from the "Suppressed" genes. A use case would be the Transposon_in_origin genes, where the scripts would only use the Transposon_in_origin tag in the History tag structure.

2) Talking with Mary Ann, even though this is overkill, it fills an omission whereby all tags in the #Gene_history_action can be represented as a rooted tag within the object and for consistency this should be added.

addition of:

#Gene_history_action Event Suppressed 

* Species/Sequence collection - Kevin

When we added the Assembly Sequence_collection to the ?Species class, we made it unique, on the assumption that there would only ever be one current assembly for each species.

It turns out we were wrong, so I propose to remove the UNIQUE.

* Strain modifications

?Strain - Michael

I would like to propose removal of the unused tags:

Reference_strain - with changes to the Sequence_collection class we would be able to identify a reference strain.
Males - has been superseded by the phenotype ontology

?Strain - Mary Ann

Sample_history

renamed to

Strain_history

M-A Felix wants this renamed; so far it has only been used once and is not displayed on the site.

?Strain - Mary Ann

?Strain
     Isolation GPS UNIQUE Float UNIQUE Float #Position_confidence
               Elevation UNIQUE Unique Float #Position_confidence

#Position_confidence Exact
                     Approximate
                     Inferred_from_GPS


Michael suggested a UNIQUE Float after the Approximate tag to specify the estimated error range (+-360.0) but this might be overkill.

* ?Drug_resistance

Mary Ann proposed a model but upon internal discussion it seemed that this might be better placed with Karen at Caltech.

* ?Variation and ?Feature mapping changes

Mapping

This model update unifies the mapping of these 2 classes and provides a second (better) method for mapping genomic features, which we will be using in the near future.

It has been pointed out that strand cannot be inferred for 1bp features solely from the 2 stored co-ordinates, so storing the source strand where available might be useful. As the majority of this data will be coming from GFF, we will adopt the GFF strand convention (+, - or .).

?Variation
SMap 	S_parent UNIQUE Sequence UNIQUE ?Sequence XREF Allele    //Data removed from the primary database
    	Sequence_details	Flanking_sequences UNIQUE Text UNIQUE Text  //No change
				Mapping_target UNIQUE ?Sequence //addition to store the Sequence the object should map to.
				Source_location UNIQUE Int UNIQUE ?Sequence UNIQUE Int UNIQUE Int UNIQUE Text #Evidence //data provided by paper/submitter/project

?Feature
SMap 	S_parent UNIQUE Sequence UNIQUE ?Sequence XREF Allele    //Data removed from the primary database
    	Sequence_details	Flanking_sequences UNIQUE Text UNIQUE Text  //Removal of leading ?Sequence connection
				Mapping_target UNIQUE ?Sequence //addition to store the Sequence the object should map to.
				Source_location UNIQUE Int UNIQUE ?Sequence UNIQUE Int UNIQUE Int UNIQUE Text #Evidence //data provided by paper/submitter/project
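The strand problem described above can be sketched as follows. This assumes the usual convention that start < stop means forward strand and start > stop means reverse strand, and that the trailing Text field of Source_location holds the GFF strand; both are assumptions made for illustration, and the function name `infer_strand` is hypothetical.

```python
# Strand symbols adopted from the GFF convention named in the text.
GFF_STRANDS = {"+", "-", "."}


def infer_strand(start: int, stop: int, source_strand: str = ".") -> str:
    """Return a GFF-style strand for a mapped feature.

    When the two co-ordinates differ, their order decides the strand.
    For a 1bp feature (start == stop) the order carries no information,
    so the stored source strand is returned instead.
    """
    if source_strand not in GFF_STRANDS:
        raise ValueError(f"not a GFF strand: {source_strand!r}")
    if start < stop:
        return "+"
    if start > stop:
        return "-"
    return source_strand


if __name__ == "__main__":
    print(infer_strand(100, 200))       # co-ordinates decide
    print(infer_strand(150, 150, "-"))  # 1bp feature, stored strand decides
```

Without a stored strand, a 1bp feature falls back to "." (unknown or unstranded).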

Gary W.

Would also like to add a Text description of the type of score followed by an #Evidence hash to the ?Feature model to follow Score

?Feature Score Float Text #Evidence

* ?Gene and Disease Ontology - Ranjana

Proposed; working through ideas with Hinxton.

Background (wiki page: Model_changes_to_capture_and_consolidate_human_disease_data)

?Gene additions

?Gene
DB_info  Database ?Database ?Database_field Text //for pointing to OMIM ortholog and disease
Disease_info 	Experimental_model ?DO_term XREF Gene_by_biology   ?Species   #Evidence	            
             	Potential_model	   ?DO_term XREF Gene_by_orthology ?Species #Evidence
             	Disease_relevance  ?Text ?Species #Evidence

?DO_term - New class

?DO_term 
Name  UNIQUE               ?Text
Status UNIQUE              Valid
                           Obsolete
Alternate_id               ?Text
Definition UNIQUE          ?Text
Comment                    Text
Synonymn           Broad   ?Text
                   Exact   ?Text
                   Narrow  ?Text
                   Related ?Text
Parent             Is_a  	?DO_term  XREF  Is
Child              Is 	        ?DO_term  XREF  Is_a 
DB_info            Database     ?Database  ?Database_field   Text              
Type               GOLD                   
                   gram_negative_bacterial_infectious_disease  //changed - to _ in tag names.
                   gram_positive_bacterial_infectious_disease
                   sexually_transmitted_infectious_disease
                   tick_borne_infectious_disease
                   zoonotic_infectious_disease
Attribute_of       Gene_by_biology    ?Gene       XREF   Experimental_model
                   Gene_by_orthology  ?Gene       XREF   Potential_model
                   Phenotype  ?Phenotype  XREF   DO_term
                   WBProcess  ?WBProcess  XREF   DO_term
                   Reference  ?Paper      XREF   DO_term 
Version            UNIQUE Text
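The change of - to _ in the Type tag names could be applied mechanically when loading terms from the Disease Ontology source. A minimal sketch, assuming the source names arrive as plain strings; the helper name `normalise_tag` is hypothetical.

```python
def normalise_tag(name: str) -> str:
    """Replace hyphens with underscores so the name matches the model's tag names."""
    return name.replace("-", "_")


if __name__ == "__main__":
    for source_name in (
        "gram-negative_bacterial_infectious_disease",
        "gram-positive_bacterial_infectious_disease",
        "tick-borne_infectious_disease",
    ):
        print(normalise_tag(source_name))
```

Names without hyphens, such as zoonotic_infectious_disease, pass through unchanged.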

Other class modifications because of XREF/References

?Phenotype
DO_term ?DO_term #Evidence
 
?WBProcess
DO_term ?DO_term #Evidence

?Paper 
DO_term ?DO_term #Evidence

Possible changes

Most previously mentioned have been resolved

1) Possible redundancy between Index and Parent/Child

Postponed tags:

Index              Ancestor   ?DO_term   XREF Descendent      
                   Descendent ?DO_term   XREF Ancestor 

* Condition Class - Wen

Change the Temperature Int to a Float

Temperature Int

Temperature Float


© WormBase 2012