Difference between revisions of "Genotype"

From WormBaseWiki
(32 intermediate revisions by 3 users not shown)
Line 1: Line 1:
=Genotype as an object=
+
 
''This page is meant as a chalkboard to throw around ideas and eventually formulate a plan for curating to genotypes and strains rather than individual variations or single genes.''<br>
+
==Rationale==
Caltech Group Meeting [http://wiki.wormbase.org/index.php/WormBase-Caltech_Weekly_Calls#May_19.2C_2016 19-May-2016 notes] <br>
+
 
Objectifying Genotypes
+
To enable curators (disease curators and then maybe phenotype curators) to annotate directly to entire genotypes, the ?Genotype class has been proposed. The decision was made that any ?Genotype object should reflect the entire genotype of a particular worm or strain of worms, to the extent that it is known, rather than partial genotypes. The initial data model allows for direct associations to disease model annotations (curated in the disease OA), whereby genotype associations to diseases (DO terms) would then be made during the Postgres curation database dump.
Increasingly, we are encountering needs to objectify genotypes
+
 
We have strain objects, although strains are not always specified in papers
+
==Issues==
We could potentially create strain objects that are named according to a paper ID
+
 
This would enable us to capture certain phenotype and genetic interaction annotations
+
==Action items==
We need to consider a strain's phenotype in relation to a control strain
+
* For future:
We already annotate phenotypes to transgenes while specifying the gene (in the transgene) that causes the phenotype
+
** Automatically generate Genotype names according to [https://wormbase.org/about/userguide/nomenclature#c--10 nomenclature guidelines] based on "Genotype_component" tag entries
It will be good for WB curators to hear from other MODs about how they handle this
+
*** Note: zygosity will not be accounted for unless zygosity is added back into the model
We can reconvene after today's phenotype call and focus on this discussion for future cross-MOD calls
+
** Build database pipeline scripts to add automatically inferred gene associations from variation-to-gene mapping pipeline output
 +
** Add in strain references in ?Genotype model and either (A) co-opt existing "Genotype" tag in ?Strain model to point to ?Genotype objects instead of free text (would require transforming all existing free text entries into ?Genotype objects) and/or (B) supplement ?Strain model with additional "Genotype" tag to point to ?Genotype objects while possibly changing "Genotype" tag for free text to "Genotype_text"
 +
** If necessary (if use cases exist), add back in zygosity and/or maternal/paternal genotype references
  
 
== ?Genotype model ==
 
== ?Genotype model ==
Line 17: Line 19:
  
 
<pre>
 
<pre>
?Genotype Genotype_name  UNIQUE  ?Text //  e.g. “unc-1(e103);unc-2(e234)”
+
?Genotype Genotype_name  UNIQUE  ?Text //  e.g. “unc-1(e103);unc-2(e234)”; eventually to be automatically generated
 +
                Genotype_synonym  ?Text    // To capture literal representations in the literature
 
                 Genotype_component Gene ?Gene   XREF  Component_of_genotype  #Evidence
 
                 Genotype_component Gene ?Gene   XREF  Component_of_genotype  #Evidence
                                         Variation ?Variation XREF Component_of_genotype UNIQUE Text // Zygosity
+
                                         Variation ?Variation XREF Component_of_genotype
                        Rearrangement ?Rearrangement  XREF Component_of_genotype UNIQUE Text // Zygosity
+
                        Rearrangement ?Rearrangement  XREF Component_of_genotype
                        Transgene ?Transgene  XREF  Component_of_genotype UNIQUE Text // Zygosity
+
                        Transgene ?Transgene  XREF  Component_of_genotype
                        // Zygosity text entry should be one of the following:
 
                        // Homozygous, Heterozygous_with_wildtype, or Heteroallelic_combination
 
 
                        Other_component UNIQUE  ?Text  // Free text components including RNAi
 
                        Other_component UNIQUE  ?Text  // Free text components including RNAi
Is_genotype_for_strain ?Strain XREF Genotype
+
Disease_info  Models_disease  ?DO_term  XREF  Disease_model_genotype
Has_background_strain  UNIQUE  ?Strain  XREF  Is_background_for // Only use when NOT N2
+
      Modifies_disease  ?DO_term  XREF  Disease_modifier_genotype
Relation_to_other_genotypes   Has_maternal_genotype UNIQUE ?Genotype XREF Is_maternal_genotype_for
+
      Models_disease_in_annotation ?Disease_model_annotation  XREF  Genotype
  Has_paternal_genotype UNIQUE ?Genotype XREF Is_paternal_genotype_for
+
          Modifies_disease_in_annotation ?Disease_model_annotation  XREF  Modifier_genotype
    Is_maternal_genotype_for ?Genotype XREF Has_maternal_genotype
+
Species UNIQUE ?Species
  Is_paternal_genotype_for ?Genotype XREF Has_paternal_genotype
+
Remark ?Text                            
Disease_info  Models_disease  ?DO_term  XREF  Disease_model_genotype
+
Reference ?Paper XREF Genotype
  Modifies_disease  ?DO_term  XREF  Disease_modifier_genotype
 
  Disease_model_annotation  Model_genotype ?Disease_model_annotation  XREF  Genotype
 
      Modifier_genotype ?Disease_model_annotation  XREF  Modifier_genotype
 
Species UNIQUE ?Species
 
Remark ?Text                            
 
Reference ?Paper XREF Genotype
 
 
</pre>
 
</pre>
  
 +
== Model issues ==
 +
* Model proposal, including proposed changes to other classes, on [https://docs.google.com/document/d/19hP9r6BpPW3FSAeC_67FNyNq58NGp4eaXBT42Ch3gDE/edit?usp=sharing this Google Doc]
 +
* Database build will need to populate the "Gene" tag with genes inferred from the variation-to-gene mapping pipeline, if they are not already manually populated by the curator/Postgres
 +
** Genes inferred automatically by the build process should get an #Evidence hash entry of "Inferred_automatically"
 +
* The "Gene" tag will get automatically populated by the OA dumper for ?Rearrangement and ?Transgene objects
 +
** Genes referenced in ?Rearrangement objects' "Gene_inside" tag will populate the "Gene" tag in the ?Genotype object
 +
** Genes referenced in a ?Transgene object's corresponding ?Construct object's (from "Construct" and "Coinjection" tags) "Driven_by_gene", "Gene" and "3_UTR" tags will populate the "Gene" tag in the ?Genotype object
 +
* Disease assertions (and relevant paper connections) will be dumped by the Disease Model Annotation OA
  
  
===Rationale===
+
== Genotype Postgres Tables ==
 
+
*gno_identifier
===Issues===
+
*gno_curator
 +
*gno_name
 +
*gno_synonym
 +
*gno_gene
 +
*gno_variation
 +
*gno_rearrangement
 +
*gno_transgene
 +
*gno_othercomp
 +
*gno_species
 +
*gno_remark
 +
*gno_paper
 +
*gno_nodump
  
===Action items===
+
== Genotype OA ==
 +
*pgid - no postgres table - the postgres ID - NOT DUMPED
 +
*ID - gno_identifier - Genotype primary ID #Note: This is automatically assigned (format: WBGenotype00000001)
 +
*Curator - gno_curator - NOT DUMPED - Curator (Dropdown)
 +
*Name - gno_name - Genotype_name - Big text
 +
*Synonym - gno_synonym - Genotype_synonym - Big text (multiple entries separated by a bar, "|")
 +
*Gene - gno_gene - Gene - ?Gene (Multi-ontology)
 +
*Variation - gno_variation - Variation - ?Variation  (Multi-ontology)
 +
*Rearrangement - gno_rearrangement - Rearrangement - ?Rearrangement  (Multi-ontology)
 +
*Transgene - gno_transgene - Transgene - ?Transgene  (Multi-ontology)
 +
*Other Component - gno_othercomp - Other_component - Big text
 +
*Species - gno_species - Species - ?Species (Dropdown)
 +
*Remark - gno_remark - Remark - Big text
 +
*Paper - gno_paper - Reference - ?Paper (Multi-ontology)
 +
*NO DUMP - gno_nodump - NOT DUMPED - toggle

Revision as of 18:15, 14 May 2020

Rationale

To enable curators (disease curators and then maybe phenotype curators) to annotate directly to entire genotypes, the ?Genotype class has been proposed. The decision was made that any ?Genotype object should reflect the entire genotype of a particular worm or strain of worms, to the extent that it is known, rather than partial genotypes. The initial data model allows for direct associations to disease model annotations (curated in the disease OA), whereby genotype associations to diseases (DO terms) would then be made during the Postgres curation database dump.

Issues

Action items

  • For future:
    • Automatically generate Genotype names according to nomenclature guidelines based on "Genotype_component" tag entries
      • Note: zygosity will not be accounted for unless zygosity is added back into the model
    • Build database pipeline scripts to add automatically inferred gene associations from variation-to-gene mapping pipeline output
    • Add in strain references in ?Genotype model and either (A) co-opt existing "Genotype" tag in ?Strain model to point to ?Genotype objects instead of free text (would require transforming all existing free text entries into ?Genotype objects) and/or (B) supplement ?Strain model with additional "Genotype" tag to point to ?Genotype objects while possibly changing "Genotype" tag for free text to "Genotype_text"
    • If necessary (if use cases exist), add back in zygosity and/or maternal/paternal genotype references
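
The first action item (generating Genotype names from "Genotype_component" entries) could be prototyped roughly as below. This is only a sketch under stated assumptions: the `Component` class and `genotype_name` function are hypothetical names, components are joined with ";" to match the example name in the model, and sorting is alphabetical purely to make the output stable. The nomenclature guidelines may impose ordering and separator rules that this sketch does not attempt, and zygosity is not represented, in line with the note above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One genotype component (hypothetical structure, for illustration only)."""
    gene: str    # public gene name, e.g. "unc-1"
    allele: str  # variation name, e.g. "e103"


def genotype_name(components):
    """Build a name such as "unc-1(e103);unc-2(e234)" from components.

    Assumption: components are rendered as gene(allele), sorted
    alphabetically, and joined with ";". Zygosity is ignored.
    """
    parts = sorted(f"{c.gene}({c.allele})" for c in components)
    return ";".join(parts)


if __name__ == "__main__":
    print(genotype_name([Component("unc-2", "e234"), Component("unc-1", "e103")]))
```

Sorting before joining means the same set of components always yields the same name regardless of entry order, which matters if Genotype_name is to stay UNIQUE.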

?Genotype model

Initial proposal:

?Genotype  Genotype_name  UNIQUE  ?Text  // e.g. “unc-1(e103);unc-2(e234)”; eventually to be automatically generated
           Genotype_synonym  ?Text  // To capture literal representations in the literature
           Genotype_component  Gene  ?Gene  XREF  Component_of_genotype  #Evidence
                               Variation  ?Variation  XREF  Component_of_genotype
                               Rearrangement  ?Rearrangement  XREF  Component_of_genotype
                               Transgene  ?Transgene  XREF  Component_of_genotype
                               Other_component  UNIQUE  ?Text  // Free text components including RNAi
           Disease_info  Models_disease  ?DO_term  XREF  Disease_model_genotype
                         Modifies_disease  ?DO_term  XREF  Disease_modifier_genotype
                         Models_disease_in_annotation  ?Disease_model_annotation  XREF  Genotype
                         Modifies_disease_in_annotation  ?Disease_model_annotation  XREF  Modifier_genotype
           Species  UNIQUE  ?Species
           Remark  ?Text
           Reference  ?Paper  XREF  Genotype
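
To make the shape of the proposed model easier to reason about, here is a minimal Python sketch that mirrors its non-disease tags and renders one object as .ace-style text. The `Genotype` class, its field names, and `to_ace` are hypothetical; the output layout (a `Class : "name"` header followed by `Tag "value"` lines) is an assumption about what a dump might look like, not the actual dumper. The Disease_info tags are left out because disease assertions are expected to come from the Disease Model Annotation OA. Values are not escaped, and the IDs in the usage example are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Genotype:
    """Hypothetical in-memory mirror of the proposed ?Genotype model."""
    identifier: str
    name: Optional[str] = None              # Genotype_name (UNIQUE)
    synonyms: List[str] = field(default_factory=list)
    genes: List[str] = field(default_factory=list)
    variations: List[str] = field(default_factory=list)
    rearrangements: List[str] = field(default_factory=list)
    transgenes: List[str] = field(default_factory=list)
    other_component: Optional[str] = None   # Other_component (UNIQUE)
    species: Optional[str] = None           # Species (UNIQUE)
    remarks: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)

    def to_ace(self) -> str:
        """Render the object as .ace-style text (assumed layout)."""
        def one(value):
            # Wrap a UNIQUE (single-valued) field so all tags are handled alike.
            return [] if value is None else [value]

        tagged = [
            ("Genotype_name", one(self.name)),
            ("Genotype_synonym", self.synonyms),
            ("Gene", self.genes),
            ("Variation", self.variations),
            ("Rearrangement", self.rearrangements),
            ("Transgene", self.transgenes),
            ("Other_component", one(self.other_component)),
            ("Species", one(self.species)),
            ("Remark", self.remarks),
            ("Reference", self.references),
        ]
        lines = [f'Genotype : "{self.identifier}"']
        for tag, values in tagged:
            lines.extend(f'{tag} "{v}"' for v in values)
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    g = Genotype(
        "WBGenotype00000001",
        name="unc-1(e103);unc-2(e234)",
        genes=["WBGene00000001"],  # placeholder ID
    )
    print(g.to_ace())
```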

Model issues

  • Model proposal, including proposed changes to other classes, on this Google Doc
  • Database build will need to populate the "Gene" tag with genes inferred from the variation-to-gene mapping pipeline, if they are not already manually populated by the curator/Postgres
    • Genes inferred automatically by the build process should get an #Evidence hash entry of "Inferred_automatically"
  • The "Gene" tag will get automatically populated by the OA dumper for ?Rearrangement and ?Transgene objects
    • Genes referenced in ?Rearrangement objects' "Gene_inside" tag will populate the "Gene" tag in the ?Genotype object
    • Genes referenced in a ?Transgene object's corresponding ?Construct object's (from "Construct" and "Coinjection" tags) "Driven_by_gene", "Gene" and "3_UTR" tags will populate the "Gene" tag in the ?Genotype object
  • Disease assertions (and relevant paper connections) will be dumped by the Disease Model Annotation OA
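
The gene-inference rule described above (add genes from the variation-to-gene mapping only when they are not already curated, and tag them "Inferred_automatically") can be sketched as follows. The function name `populate_genes` and the dictionary-based input format are hypothetical; the real mapping pipeline output and build scripts are not described on this page. The same pattern could in principle be applied to genes drawn from ?Rearrangement and ?Transgene objects.

```python
def populate_genes(curated_genes, variations, variation_to_genes):
    """Return {gene_id: evidence} for the "Gene" tag of one genotype.

    curated_genes:      genes already entered by the curator/Postgres
    variations:         variation IDs listed on the genotype
    variation_to_genes: assumed mapping {variation_id: [gene_id, ...]}

    Curated genes keep evidence None; genes added here are marked
    "Inferred_automatically". Curated entries are never overwritten.
    """
    genes = {gene: None for gene in curated_genes}
    for variation in variations:
        for gene in variation_to_genes.get(variation, []):
            # setdefault leaves an existing (curated) entry untouched.
            genes.setdefault(gene, "Inferred_automatically")
    return genes


if __name__ == "__main__":
    mapping = {"VarA": ["GeneA"], "VarB": ["GeneB"]}  # placeholder IDs
    print(populate_genes(["GeneA"], ["VarA", "VarB"], mapping))
```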


Genotype Postgres Tables

  • gno_identifier
  • gno_curator
  • gno_name
  • gno_synonym
  • gno_gene
  • gno_variation
  • gno_rearrangement
  • gno_transgene
  • gno_othercomp
  • gno_species
  • gno_remark
  • gno_paper
  • gno_nodump

Genotype OA

  • pgid - no postgres table - the postgres ID - NOT DUMPED
  • ID - gno_identifier - Genotype primary ID #Note: This is automatically assigned (format: WBGenotype00000001)
  • Curator - gno_curator - NOT DUMPED - Curator (Dropdown)
  • Name - gno_name - Genotype_name - Big text
  • Synonym - gno_synonym - Genotype_synonym - Big text (multiple entries separated by a bar, "|")
  • Gene - gno_gene - Gene - ?Gene (Multi-ontology)
  • Variation - gno_variation - Variation - ?Variation (Multi-ontology)
  • Rearrangement - gno_rearrangement - Rearrangement - ?Rearrangement (Multi-ontology)
  • Transgene - gno_transgene - Transgene - ?Transgene (Multi-ontology)
  • Other Component - gno_othercomp - Other_component - Big text
  • Species - gno_species - Species - ?Species (Dropdown)
  • Remark - gno_remark - Remark - Big text
  • Paper - gno_paper - Reference - ?Paper (Multi-ontology)
  • NO DUMP - gno_nodump - NOT DUMPED - toggle
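
The field list above implies a simple dump rule: each Postgres table maps to one model tag, the fields marked NOT DUMPED are skipped, synonyms are split on "|", and a row with the NO DUMP toggle set is omitted entirely. A minimal sketch of that logic follows. The function names and the row format (a dict keyed by table name, with multi-ontology fields already given as lists) are assumptions for illustration; how the OA actually stores these values in Postgres is not specified here.

```python
# Postgres table -> model tag, taken from the field list above.
# gno_curator and gno_nodump are marked NOT DUMPED, so they are left out.
DUMPED_FIELDS = [
    ("gno_name", "Genotype_name"),
    ("gno_synonym", "Genotype_synonym"),
    ("gno_gene", "Gene"),
    ("gno_variation", "Variation"),
    ("gno_rearrangement", "Rearrangement"),
    ("gno_transgene", "Transgene"),
    ("gno_othercomp", "Other_component"),
    ("gno_species", "Species"),
    ("gno_remark", "Remark"),
    ("gno_paper", "Reference"),
]


def format_genotype_id(number):
    """Format an ID like WBGenotype00000001 (eight zero-padded digits)."""
    return f"WBGenotype{number:08d}"


def dump_row(row):
    """Turn one OA row into (identifier, [(tag, value), ...]).

    Returns None when the NO DUMP toggle is set. Assumes multi-ontology
    fields arrive as lists and text fields as plain strings.
    """
    if row.get("gno_nodump"):
        return None
    pairs = []
    for table, tag in DUMPED_FIELDS:
        value = row.get(table)
        if not value:
            continue
        if table == "gno_synonym":
            # Synonyms are stored as one "|"-separated string.
            values = [s.strip() for s in value.split("|") if s.strip()]
        elif isinstance(value, (list, tuple)):
            values = list(value)
        else:
            values = [value]
        pairs.extend((tag, v) for v in values)
    return row["gno_identifier"], pairs


if __name__ == "__main__":
    row = {
        "gno_identifier": format_genotype_id(1),
        "gno_curator": "Curator Name",        # not dumped
        "gno_name": "unc-1(e103);unc-2(e234)",
        "gno_synonym": "synonym one | synonym two",
        "gno_gene": ["WBGene00000001"],       # placeholder ID
    }
    print(dump_row(row))
```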