Difference between revisions of "Evidence Code Ontology"

From WormBaseWiki
(Created page with "= Overview = *[https://github.com/evidenceontology/evidenceontology The Evidence and Conclusion Ontology] is used in WormBase to record evidence used for assertions in curati...")
 
 
(54 intermediate revisions by the same user not shown)
Line 2: Line 2:
 
*[https://github.com/evidenceontology/evidenceontology The Evidence and Conclusion Ontology] is used in WormBase to record evidence used for assertions in curation data models.
 
*[https://github.com/evidenceontology/evidenceontology The Evidence and Conclusion Ontology] is used in WormBase to record evidence used for assertions in curation data models.
  
*The Evidence and Conclusion Ontology will be incorporated into a future release of WormBase, aiming for WS278.
+
*The Evidence and Conclusion Ontology will be incorporated into a future release of WormBase, aiming for WS278, and is proposed to eventually fully replace ?GO_code.
  
 
= ?ECO_term Model =
 
= ?ECO_term Model =
Line 15: Line 15:
 
                   Narrow ?Text
 
                   Narrow ?Text
 
                   Related ?Text
 
                   Related ?Text
           Child Instance ?ECO_term XREF Instance_of
+
           Child ?ECO_term XREF Parent
           Parent Instance_of ?ECO_term XREF Instance
+
           Parent ?ECO_term XREF Child
          Attribute_of GO_annotation ?GO_annotation XREF ?ECO_code
 
          Index Ancestor ?ECO_term XREF Descendant
 
                Descendant ?ECO_term XREF Ancestor
 
 
           Version UNIQUE Text
 
           Version UNIQUE Text
 +
 +
= Other models affected for WS278 =
 +
* Will replace ?GO_code with ?ECO_term in:
 +
** ?GO_annotation
 +
** ?Phenotype
 +
** ?Disease_model_annotation (will eventually replace ?GO_code with ?ECO_term)
 +
* Will remove ?GO_code in:
 +
** ?Cell
 +
 +
= Additional model cleanup? Are these tags being used at all? =
 +
# ?Gene - remove ?GO_term tag and associated info?
 +
# ?Sequence - remove ?GO_term tag and associated info?
 +
# ?CDS - remove ?GO_term tag and associated info?
 +
# ?Transcript - remove ?GO_term tag and associated info?
 +
  
 
= Parsing Script for ?ECO_term Model  =
 
= Parsing Script for ?ECO_term Model  =
'''TO BE ADDED'''information below is wrt RO parsing
+
* The parsing script to generate the eco_terms.ace file is on tazendra here:
*The parsing script to generate the ro_terms.ace file is on tazendra here:
+
** /home/acedb/kimberly/citace_upload/eco/ontology2ace/eco_obo_to_eco_ace.pl
**/home/acedb/kimberly/citace_upload/ro/ontology2ace/ro_obo_to_ro_ace.pl
+
 
 +
* Input file:
 +
** https://raw.githubusercontent.com/evidenceontology/evidenceontology/master/eco.obo
 +
 
 +
* Output file:
 +
** eco_terms.ace
 +
 
 +
* There are two types of entries in ro.obo
 +
** Term
 +
** Typedef
 +
* We'll only include the Term entries
 +
 
 +
== ECO obo file to WB ace file mappings ==
 +
 
 +
{| cellspacing="2" border="1"
 +
|-
 +
! ace field
 +
! ECO field
 +
! Action
 +
! Example
 +
 
 +
|-
 +
| ECO_term || id: || Add ECO id as is || ECO:0000052
 +
|-
 +
| Name|| name: || Add name as is (but see below) || suppressor/enhancer interaction phenotypic evidence
 +
|-
 +
| Status || is_obsolete: || Populate with 'Valid' unless 'is_obsolete: true', then populate with Obsolete* || Valid
 +
|-
 +
| Alt_id || alt_id: || Add alt_id as is || ECO:0005018
 +
|-
 +
| Definition || def: || Add text as is || A type of affinity evidence resulting from quantitation of the analyte which depends on the reaction of an antigen (analyte) and an antibody. [ERO:0001362, url:http\://www.ncbi.nlm.nih.gov/books/NBK21589/\,http\://cores.ucsf.edu/protein-assay.html]
 +
|-
 +
| Synonym || synonym || Remove quotation marks; populate type from bracketed text; add remaining text as is  || Exact gain-of-function mutant phenotype evidence
 +
|-
 +
| Child || n/a || n/a  || n/a
 +
|-
 +
| Parent || is_a: || Add is_a parent id as is || ECO:0000015
 +
|-
 +
| Version || data-version: || Add text following 'data-version:' || releases/2020-05-15
 +
|-
 +
|}
 +
 
 +
 
 +
* Notes and Questions:
 +
** For names, may need to escape forward slashes
 +
** There are 40 obsolete ECO terms in the releases/2020-05-15
 +
** Include bracketed evidence on definitions? We aren't doing this for GO, but we probably should.  Would we need to extend the end quotation until after the references?
  
*Input file:
+
= Example obo and ace entries =
**Downloaded, manually processed version (see below) of https://raw.githubusercontent.com/oborel/obo-relations/master/ro.obo
+
== obo entry for IMP ==
 +
*[Term]
 +
*id: ECO:0000315
 +
*name: mutant phenotype evidence used in manual assertion
 +
*def: "A type of mutant phenotype evidence that is used in a manual assertion." [ECO:MCC]
 +
*subset: valid_with_chemical_entity
 +
*subset: valid_with_gene
 +
*subset: valid_with_protein
 +
*subset: valid_with_protein_complex
 +
*synonym: "IMP" EXACT [GOECO:IMP]
 +
*synonym: "inferred from mutant phenotype" EXACT [GOECO:IMP]
 +
*xref: GOECO:IMP "inferred from mutant phenotype"
 +
*is_a: ECO:0000015 ! mutant phenotype evidence
 +
*is_a: ECO:0007634 {is_inferred="true"} ! experimental phenotypic evidence used in manual assertion
 +
*intersection_of: ECO:0000015 ! mutant phenotype evidence
 +
*intersection_of: used_in ECO:0000218 ! manual assertion
 +
*property_value: seeAlso http://geneontology.org/page/imp-inferred-mutant-phenotype xsd:string
 +
*created_by: mchibucos
 +
*creation_date: 2011-10-28T05:12:49Z
  
*Output file:
+
== ace entry for IMP ==
**ro_terms.ace
+
* ECO_term : "ECO:0000315"
 +
* Name    "mutant phenotype evidence used in manual assertion"
 +
* Status  "Valid"
 +
* Definition  "A type of mutant phenotype evidence that is used in a manual assertion."
 +
* Synonym Exact "IMP"
 +
* Synonym Exact "inferred from mutant phenotype"
 +
* Parent "ECO:0000015"
 +
* Parent "ECO:0007634"
 +
* Version "Relations Ontology releases/2020-02-26"
  
*There are two types of entries in ro.obo
+
= Changes to ?GO_annotation parsing scripts =
**Term
 
**Typedef
 
*At the moment, we include Term and Typedef entries according to the following criteria:
 
**Include term where id namespace = BFO, RO, lower case text with no namespace: (e.g. results_in_acquisition_of_features_of)
 
**Skip term where id namespace = CARO, CL, ENVO, GO, ObsoleteClass, PATO
 
*Note that the terms we skip are used in the OWL representation of the Relations Ontology, but not in the OBO representation
 
**For the purposes of how the Relations Ontology will initially be used in WormBase, this filtering step should be fine
 
  
*Running the script:
+
== Protein2GO GPAD ==
#In the /home/acedb/kimberly/citace_upload/ro/ontology2ace directory on tazendra, use the wget command and the above URL to download the ro.obo input file
+
* Latest parsing script: /home/acedb/kimberly/citace_upload/go/gpad2ace/2020_July
#Greek characters used in some of the definitions for the RO terms (as an example, see id: BFO:0000063 name: precedes) need to be converted to text that ACeDB can render by doing the following (note that we tried to incorporate this as part of the script, but couldn't figure out a way to correctly recognize these characters programtically):
+
* Keep populating GO_code tag (lines 159 and 160)
 +
* Add populating ECO_term tag
 +
** ECO_term is in $evicode (Column 7) in GPAD input file  
 +
** Populate value in .ace file as is, e.g. ECO:0000315
 +
*** ECO_term "ECO:0000315"
  
*In one terminal window, open the ro.obo file in a text editor
+
== Noctua GPAD ==
*In a second terminal window, open the convertGreekVim file in a text editor
+
* Latest parsing script: /home/acedb/kimberly/citace_upload/go/gocam2ace/2020_July
*Globally replace the lower case alpha and omega symbols in ro.obo by copying and pasting the appropriate global replacement commands from convertGreekVim, e.g. :%s/α/alpha/g
+
* Keep populating GO_code tag (lines 153 and 154)
**Sanity check:
+
* Add populating ECO_term tag
***26 substitutions of lower-case alpha on 10 lines
+
** ECO_term is in $evicode (Column 7) in GPAD input file  
***24 substitutions of lower-case omega on 9 lines
+
** Populate value in .ace file as is, e.g. ECO:0000315
 +
*** ECO_term "ECO:0000315"
  
*Once the Greek characters have been replaced, run the parsing script and change the name of the output file from ro_terms.ace to ro_terms_WSnnn.ace
+
== OA Annotations ==
 +
* Latest parsing script: /home/postgres/work/citace_upload/go_curation
 +
* Keep populating GO_code tag (line 83)
 +
* Add populating ECO_term tag
 +
** Use value in goinference
 +
** Map to ECO term (there is a mapping in /home/acedb/kimberly/citace_upload/go/gocam2ace/2020_July in lines 304-328)
 +
** Populate value in .ace file from mapping, e.g. ECO:0000315
 +
*** ECO_term "ECO:0000315"
 +
** List of mappings to use (from gocam2ace parsing script):
 +
*** $ecoToGoCode{"ECO:0000250"} = "ISS";
 +
*** $ecoToGoCode{"ECO:0000270"} = "IEP";
 +
*** $ecoToGoCode{"ECO:0000304"} = "TAS";
 +
*** $ecoToGoCode{"ECO:0000303"} = "NAS";
 +
*** $ecoToGoCode{"ECO:0000307"} = "ND";
 +
*** $ecoToGoCode{"ECO:0000305"} = "IC";
 +
*** $ecoToGoCode{"ECO:0000314"} = "IDA";
 +
*** $ecoToGoCode{"ECO:0000315"} = "IMP";
 +
*** $ecoToGoCode{"ECO:0000316"} = "IGI";
 +
*** $ecoToGoCode{"ECO:0000353"} = "IPI";

Latest revision as of 17:56, 15 June 2020

Overview

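  • The Evidence and Conclusion Ontology (ECO; https://github.com/evidenceontology/evidenceontology) is used in WormBase to record the evidence behind assertions in curation data models.
  • ECO will be incorporated into a future WormBase release, targeting WS278, and is proposed to eventually replace ?GO_code entirely.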

?ECO_term Model

 ?ECO_term Name UNIQUE ?Text
          Status UNIQUE Valid
                        Obsolete
          Alt_id ?Text
          Definition UNIQUE ?Text
          Synonym Broad ?Text
                  Exact ?Text
                  Narrow ?Text
                  Related ?Text
          Child ?ECO_term XREF Parent
          Parent ?ECO_term XREF Child
          Version UNIQUE Text
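
The Parent and Child tags in the model above are declared as each other's XREF, so only one side has to be written to the .ace file (the mapping further down fills Parent from is_a: and leaves Child as n/a). As a rough illustration of what that reciprocal link amounts to, the following Python sketch derives the Child view from a set of Parent links. The function name and the input dictionary are hypothetical, and the snippet only mimics the bookkeeping; it is not how ACeDB implements XREF.

```python
from collections import defaultdict
from typing import Dict, List


def children_from_parents(parents: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert term -> parents links into parent -> children links."""
    children: Dict[str, List[str]] = defaultdict(list)
    for term_id, parent_ids in parents.items():
        for parent_id in parent_ids:
            children[parent_id].append(term_id)
    # Sort for a stable, readable result.
    return {parent_id: sorted(kids) for parent_id, kids in children.items()}


# Parent links of the IMP term used as the example later on this page.
PARENT_LINKS = {"ECO:0000315": ["ECO:0000015", "ECO:0007634"]}
CHILD_LINKS = children_from_parents(PARENT_LINKS)
```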

Other models affected for WS278

  • Will replace ?GO_code with ?ECO_term in:
    • ?GO_annotation
    • ?Phenotype
    • ?Disease_model_annotation (will eventually replace ?GO_code with ?ECO_term)
  • Will remove ?GO_code in:
    • ?Cell

Additional model cleanup? Are these tags being used at all?

  1. ?Gene - remove ?GO_term tag and associated info?
  2. ?Sequence - remove ?GO_term tag and associated info?
  3. ?CDS - remove ?GO_term tag and associated info?
  4. ?Transcript - remove ?GO_term tag and associated info?


Parsing Script for ?ECO_term Model

  • The parsing script that generates the eco_terms.ace file is on tazendra:
    • /home/acedb/kimberly/citace_upload/eco/ontology2ace/eco_obo_to_eco_ace.pl
  • Input file:
    • https://raw.githubusercontent.com/evidenceontology/evidenceontology/master/eco.obo
  • Output file:
    • eco_terms.ace
  • There are two types of entries in eco.obo:
    • Term
    • Typedef
  • Only the Term entries are included.
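
The filtering step described above (keep Term entries, skip Typedef entries) can be sketched as follows. This is a minimal Python illustration, not the Perl script on tazendra: the function name, the sample text, and the choice to return header tags separately are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

Stanza = Dict[str, List[str]]


def parse_obo(text: str) -> Tuple[Dict[str, str], List[Stanza]]:
    """Return (header tags, [Term] stanzas); other stanza types are skipped."""
    header: Dict[str, str] = {}
    terms: List[Stanza] = []
    current: Optional[Stanza] = None  # stanza being filled, None if skipped
    in_header = True
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            in_header = False
            if line == "[Term]":
                current = {}
                terms.append(current)
            else:
                current = None  # e.g. [Typedef]: ignore its tag lines
            continue
        tag, sep, value = line.partition(":")
        if not sep:
            continue  # not a tag: value line
        if in_header:
            header[tag] = value.strip()
        elif current is not None:
            current.setdefault(tag, []).append(value.strip())
    return header, terms


# Small hand-written sample in OBO syntax (abridged, for illustration only).
SAMPLE_OBO = """format-version: 1.2
data-version: releases/2020-05-15

[Term]
id: ECO:0000315
name: mutant phenotype evidence used in manual assertion
is_a: ECO:0000015 ! mutant phenotype evidence

[Typedef]
id: used_in
name: used_in
"""

HEADER, TERMS = parse_obo(SAMPLE_OBO)
```

Collecting every tag into a list keeps repeated tags such as is_a: and synonym: intact, which the mapping to ace fields relies on.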

ECO obo file to WB ace file mappings

{| cellspacing="2" border="1"
|-
! ace field
! ECO field
! Action
! Example
|-
| ECO_term || id: || Add ECO id as is || ECO:0000052
|-
| Name || name: || Add name as is (but see the notes below) || suppressor/enhancer interaction phenotypic evidence
|-
| Status || is_obsolete: || Populate with 'Valid' unless 'is_obsolete: true', in which case populate with 'Obsolete' || Valid
|-
| Alt_id || alt_id: || Add alt_id as is || ECO:0005018
|-
| Definition || def: || Add text as is || A type of affinity evidence resulting from quantitation of the analyte which depends on the reaction of an antigen (analyte) and an antibody. [ERO:0001362, url:http\://www.ncbi.nlm.nih.gov/books/NBK21589/\,http\://cores.ucsf.edu/protein-assay.html]
|-
| Synonym || synonym: || Remove quotation marks; populate type from the scope keyword that follows the quoted text (EXACT, BROAD, NARROW or RELATED); add remaining text as is || Exact gain-of-function mutant phenotype evidence
|-
| Child || n/a || n/a || n/a
|-
| Parent || is_a: || Add is_a parent id as is || ECO:0000015
|-
| Version || data-version: || Add text following 'data-version:' || releases/2020-05-15
|}


  • Notes and Questions:
    • Names may need forward slashes escaped (e.g. suppressor/enhancer interaction phenotypic evidence).
    • There are 40 obsolete ECO terms in the releases/2020-05-15 version.
    • Should the bracketed evidence on definitions be included? We aren't doing this for GO, but we probably should. Would the closing quotation mark need to move to after the references?
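
Two of the notes above concern string handling: escaping forward slashes in names, and deciding what to do with the bracketed references on a def: line. The Python sketch below shows one way to separate the quoted definition from its references and to escape slashes. Both helper names are hypothetical, and the backslash escape is an assumption about what the .ace loader expects; the notes only say that escaping may be needed.

```python
import re
from typing import Tuple

# Quoted text (allowing backslash escapes), then an optional [ ... ] block.
DEF_RE = re.compile(r'^"((?:[^"\\]|\\.)*)"\s*(\[.*\])?\s*$')


def split_def(value: str) -> Tuple[str, str]:
    """Split an OBO def: value into (definition text, bracketed references)."""
    match = DEF_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognised def: value: {value!r}")
    return match.group(1), match.group(2) or ""


def escape_slashes(text: str) -> str:
    """Backslash-escape forward slashes (assumed .ace convention)."""
    return text.replace("/", "\\/")


DEF_TEXT, DEF_REFS = split_def(
    '"A type of mutant phenotype evidence that is used in a manual assertion." [ECO:MCC]'
)
ESCAPED_NAME = escape_slashes("suppressor/enhancer interaction phenotypic evidence")
```

Keeping the references as a separate return value leaves the open question undecided: the caller can drop them, as is done for GO, or append them inside the closing quotation mark.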

Example obo and ace entries

obo entry for IMP

  • [Term]
  • id: ECO:0000315
  • name: mutant phenotype evidence used in manual assertion
  • def: "A type of mutant phenotype evidence that is used in a manual assertion." [ECO:MCC]
  • subset: valid_with_chemical_entity
  • subset: valid_with_gene
  • subset: valid_with_protein
  • subset: valid_with_protein_complex
  • synonym: "IMP" EXACT [GOECO:IMP]
  • synonym: "inferred from mutant phenotype" EXACT [GOECO:IMP]
  • xref: GOECO:IMP "inferred from mutant phenotype"
  • is_a: ECO:0000015 ! mutant phenotype evidence
  • is_a: ECO:0007634 {is_inferred="true"} ! experimental phenotypic evidence used in manual assertion
  • intersection_of: ECO:0000015 ! mutant phenotype evidence
  • intersection_of: used_in ECO:0000218 ! manual assertion
  • property_value: seeAlso http://geneontology.org/page/imp-inferred-mutant-phenotype xsd:string
  • created_by: mchibucos
  • creation_date: 2011-10-28T05:12:49Z

ace entry for IMP

  • ECO_term : "ECO:0000315"
  • Name "mutant phenotype evidence used in manual assertion"
  • Status "Valid"
  • Definition "A type of mutant phenotype evidence that is used in a manual assertion."
  • Synonym Exact "IMP"
  • Synonym Exact "inferred from mutant phenotype"
  • Parent "ECO:0000015"
  • Parent "ECO:0007634"
  • Version "releases/2020-05-15"
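
To make the obo-to-ace mapping concrete, here is a hedged Python sketch that turns the IMP stanza above into ace lines following the mapping table. It is an illustration only, not the eco_obo_to_eco_ace.pl script: the function name and the dictionary layout are assumptions, the stanza is abridged to the tags the table uses, bracketed references on the definition are dropped, and the version string is passed in by the caller (here the data-version example from the table).

```python
import re
from typing import Dict, List

QUOTED_RE = re.compile(r'^"((?:[^"\\]|\\.)*)"')
SYNONYM_RE = re.compile(r'^"((?:[^"\\]|\\.)*)"\s+(EXACT|BROAD|NARROW|RELATED)\b')


def term_to_ace(term: Dict[str, List[str]], version: str) -> List[str]:
    """Render one parsed [Term] stanza as a list of .ace lines."""
    lines = [f'ECO_term : "{term["id"][0]}"', f'Name "{term["name"][0]}"']
    obsolete = term.get("is_obsolete", ["false"])[0] == "true"
    lines.append('Status "Obsolete"' if obsolete else 'Status "Valid"')
    for alt_id in term.get("alt_id", []):
        lines.append(f'Alt_id "{alt_id}"')
    for definition in term.get("def", []):
        match = QUOTED_RE.match(definition)
        if match:  # keep the quoted text, drop the bracketed references
            lines.append(f'Definition "{match.group(1)}"')
    for synonym in term.get("synonym", []):
        match = SYNONYM_RE.match(synonym)
        if match:  # scope keyword becomes the Synonym sub-tag
            lines.append(f'Synonym {match.group(2).capitalize()} "{match.group(1)}"')
    for parent in term.get("is_a", []):
        # The id is the first token; qualifiers and "! comment" follow it.
        lines.append(f'Parent "{parent.split()[0]}"')
    lines.append(f'Version "{version}"')
    return lines


IMP_TERM = {
    "id": ["ECO:0000315"],
    "name": ["mutant phenotype evidence used in manual assertion"],
    "def": ['"A type of mutant phenotype evidence that is used in a manual assertion." [ECO:MCC]'],
    "synonym": ['"IMP" EXACT [GOECO:IMP]',
                '"inferred from mutant phenotype" EXACT [GOECO:IMP]'],
    "is_a": ["ECO:0000015 ! mutant phenotype evidence",
             'ECO:0007634 {is_inferred="true"} ! experimental phenotypic evidence used in manual assertion'],
}

IMP_ACE = term_to_ace(IMP_TERM, "releases/2020-05-15")
```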

Changes to ?GO_annotation parsing scripts

Protein2GO GPAD

  • Latest parsing script: /home/acedb/kimberly/citace_upload/go/gpad2ace/2020_July
  • Keep populating the GO_code tag (lines 159 and 160 of the script)
  • Also populate the ECO_term tag
    • The ECO term ID is in $evicode (Column 7) of the GPAD input file
    • Write the value to the .ace file as is, e.g. ECO:0000315
      • ECO_term "ECO:0000315"

Noctua GPAD

  • Latest parsing script: /home/acedb/kimberly/citace_upload/go/gocam2ace/2020_July
  • Keep populating the GO_code tag (lines 153 and 154 of the script)
  • Also populate the ECO_term tag
    • The ECO term ID is in $evicode (Column 7) of the GPAD input file
    • Write the value to the .ace file as is, e.g. ECO:0000315
      • ECO_term "ECO:0000315"
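
For both GPAD parsers (Protein2GO and Noctua) the change is the same: read the evidence column and emit it unchanged as an ECO_term line. A minimal Python sketch of that step follows. The function name is hypothetical, the column number is taken from the text above rather than checked against a GPAD specification, and the sample row uses placeholder values in every column except the evidence one.

```python
from typing import Optional

# 1-based column holding the ECO ID, as stated in the text above.
EVIDENCE_COLUMN = 7


def eco_tag_from_gpad(line: str, column: int = EVIDENCE_COLUMN) -> Optional[str]:
    """Return an ECO_term .ace line for one GPAD row, or None if not applicable."""
    if not line.strip() or line.startswith("!"):
        return None  # blank line or header/comment line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < column:
        return None  # row too short to contain the evidence column
    evicode = fields[column - 1].strip()
    if not evicode.startswith("ECO:"):
        return None  # unexpected value: leave the tag unpopulated
    return f'ECO_term "{evicode}"'


# Placeholder row: only the seventh field is meaningful here.
SAMPLE_ROW = "\t".join(["c1", "c2", "c3", "c4", "c5", "c6", "ECO:0000315", "c8"])
SAMPLE_TAG = eco_tag_from_gpad(SAMPLE_ROW)
```

Returning None instead of raising keeps the existing GO_code output unaffected when a row has no usable ECO ID; a production script might prefer to log such rows.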

OA Annotations

  • Latest parsing script: /home/postgres/work/citace_upload/go_curation
  • Keep populating the GO_code tag (line 83 of the script)
  • Also populate the ECO_term tag
    • Use the value in goinference
    • Map it to an ECO term (the mapping is in lines 304-328 of /home/acedb/kimberly/citace_upload/go/gocam2ace/2020_July)
    • Write the mapped value to the .ace file, e.g. ECO:0000315
      • ECO_term "ECO:0000315"
    • List of mappings to use (from the gocam2ace parsing script; note that the hash is keyed by ECO ID, so it has to be inverted to look up a GO evidence code):
      • $ecoToGoCode{"ECO:0000250"} = "ISS";
      • $ecoToGoCode{"ECO:0000270"} = "IEP";
      • $ecoToGoCode{"ECO:0000304"} = "TAS";
      • $ecoToGoCode{"ECO:0000303"} = "NAS";
      • $ecoToGoCode{"ECO:0000307"} = "ND";
      • $ecoToGoCode{"ECO:0000305"} = "IC";
      • $ecoToGoCode{"ECO:0000314"} = "IDA";
      • $ecoToGoCode{"ECO:0000315"} = "IMP";
      • $ecoToGoCode{"ECO:0000316"} = "IGI";
      • $ecoToGoCode{"ECO:0000353"} = "IPI";
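
The hash listed above maps ECO IDs to GO evidence codes, whereas the OA step needs the opposite direction. The Python sketch below inverts the same ten pairs and builds the ECO_term line. It assumes that goinference holds the GO evidence code abbreviation (e.g. IMP), which the text implies but does not state; the function name is hypothetical.

```python
# The ten pairs listed above, copied from the gocam2ace mapping.
ECO_TO_GO_CODE = {
    "ECO:0000250": "ISS",
    "ECO:0000270": "IEP",
    "ECO:0000304": "TAS",
    "ECO:0000303": "NAS",
    "ECO:0000307": "ND",
    "ECO:0000305": "IC",
    "ECO:0000314": "IDA",
    "ECO:0000315": "IMP",
    "ECO:0000316": "IGI",
    "ECO:0000353": "IPI",
}

# Invert for lookup by GO evidence code.
GO_CODE_TO_ECO = {go_code: eco_id for eco_id, go_code in ECO_TO_GO_CODE.items()}

# Inversion is only safe if no GO code appears twice.
assert len(GO_CODE_TO_ECO) == len(ECO_TO_GO_CODE)


def eco_tag_from_goinference(goinference: str) -> str:
    """Return the ECO_term .ace line for a GO evidence code such as 'IMP'."""
    code = goinference.strip()
    if code not in GO_CODE_TO_ECO:
        raise ValueError(f"no ECO mapping for GO evidence code {code!r}")
    return f'ECO_term "{GO_CODE_TO_ECO[code]}"'
```

Raising on an unmapped code makes gaps in the table visible at dump time instead of silently omitting the ECO_term tag.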