Difference between revisions of "Evidence Code Ontology"

From WormBaseWiki

Revision as of 13:30, 1 June 2020

Overview

  • The Evidence and Conclusion Ontology (ECO) will be incorporated into a future release of WormBase (target: WS278) and is proposed to eventually replace ?GO_code entirely.

?ECO_term Model

 ?ECO_term Name UNIQUE ?Text
          Status UNIQUE Valid
                        Obsolete
          Alt_id ?Text
          Definition UNIQUE ?Text
          Synonym Broad ?Text
                  Exact ?Text
                  Narrow ?Text
                  Related ?Text
          Child ?ECO_term XREF Parent
          Parent ?ECO_term XREF Child
          Version UNIQUE Text
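
As an illustration of the model above, the following Python sketch renders one term as an .ace paragraph. It is not the production Perl script: the function names, the dict keys, the placeholder ID ECO:9999999, the quoting rule and the exact tag layout on each line are all assumptions made for this example. Child is not written, on the understanding that the XREF on Parent fills it in.

```python
def ace_quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes (assumed .ace rule)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def term_to_ace(term):
    """Render one term dict as an .ace paragraph for the ?ECO_term model.

    The dict keys (id, name, obsolete, alt_ids, definition, synonyms, parents)
    are hypothetical names chosen for this sketch.
    """
    lines = [f"ECO_term : {ace_quote(term['id'])}"]
    if "name" in term:
        lines.append(f"Name {ace_quote(term['name'])}")
    # Status is UNIQUE in the model, so exactly one of Valid / Obsolete is written.
    lines.append("Status " + ("Obsolete" if term.get("obsolete") else "Valid"))
    for alt_id in term.get("alt_ids", []):
        lines.append(f"Alt_id {ace_quote(alt_id)}")
    if "definition" in term:
        lines.append(f"Definition {ace_quote(term['definition'])}")
    # Synonym scopes (BROAD, EXACT, NARROW, RELATED) map onto the model's sub-tags.
    for scope, text in term.get("synonyms", []):
        lines.append(f"Synonym {scope.capitalize()} {ace_quote(text)}")
    # Only Parent is written; Child is left to the XREF.
    for parent in term.get("parents", []):
        lines.append(f"Parent {ace_quote(parent)}")
    return "\n".join(lines) + "\n"
```

Calling term_to_ace({"id": "ECO:9999999", "name": "example evidence", "parents": ["ECO:9999998"]}) returns a paragraph whose first line is the ECO_term header, followed by Name, Status and Parent lines.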

Other models affected for WS278

  • Will replace ?GO_code with ?ECO_term in:
    • ?GO_annotation
    • ?Phenotype
    • ?Disease_model_annotation (will eventually replace ?GO_code with ?ECO_term)
  • Will remove ?GO_code in:
    • ?Cell

Additional model cleanup? Are these tags being used at all?

  1. ?Gene - remove ?GO_term tag and associated info?
  2. ?Sequence - remove ?GO_term tag and associated info?
  3. ?CDS - remove ?GO_term tag and associated info?
  4. ?Transcript - remove ?GO_term tag and associated info?


Parsing Script for ?ECO_term Model

  • The parsing script to generate the eco_terms.ace file is on tazendra here:
    • /home/acedb/kimberly/citace_upload/eco/ontology2ace/eco_obo_to_eco_ace.pl
  • Input file:
    • https://raw.githubusercontent.com/evidenceontology/evidenceontology/master/eco.obo
  • Output file:
    • eco_terms.ace
  • There are two types of entries in eco.obo:
    • Term
    • Typedef
  • Only the Term entries are included; Typedef entries are skipped
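
The Term-only filtering described above could look roughly like the following Python sketch. It is an illustration, not the Perl script on tazendra: the function name iter_term_stanzas is hypothetical, and the handling of trailing "!" comments is a simplifying assumption about the OBO input.

```python
def iter_term_stanzas(obo_text):
    """Yield one dict per [Term] stanza, mapping each OBO tag to a list of values.

    Header lines before the first stanza and all other stanza types
    (for example [Typedef]) are skipped.
    """
    stanza_type = None
    current = None
    for raw_line in obo_text.splitlines():
        line = raw_line.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza header closes the previous stanza.
            if stanza_type == "Term" and current:
                yield current
            stanza_type = line[1:-1]
            current = {}
            continue
        if not line or line.startswith("!") or current is None:
            continue
        tag, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        # Assumption: on unquoted values, anything after " ! " is a comment.
        if not value.startswith('"'):
            value = value.split(" ! ", 1)[0].strip()
        current.setdefault(tag.strip(), []).append(value)
    if stanza_type == "Term" and current:
        yield current
```

Each yielded dict keeps every value as a list, since tags such as is_a, alt_id and synonym can repeat within a stanza.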

ECO obo file to WB ace file mappings

  • Running the script:
  1. In the /home/acedb/kimberly/citace_upload/ro/ontology2ace directory on tazendra, use the wget command and the above URL to download the ro.obo input file
  2. Greek characters used in some of the definitions for the RO terms (for example, id: BFO:0000063, name: precedes) need to be converted to text that ACeDB can render. We tried to incorporate this into the script but could not find a way to recognize these characters programmatically, so the conversion is done by hand:
    • In one terminal window, open the ro.obo file in a text editor
    • In a second terminal window, open the convertGreekVim file in a text editor
    • Globally replace the lower-case alpha and omega symbols in ro.obo by copying and pasting the appropriate global replacement commands from convertGreekVim, e.g. :%s/α/alpha/g
    • Sanity check:
      • 26 substitutions of lower-case alpha on 10 lines
      • 24 substitutions of lower-case omega on 9 lines
  3. Once the Greek characters have been replaced, run the parsing script and change the name of the output file from ro_terms.ace to ro_terms_WSnnn.ace
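
One possible way to script the Greek-character replacement, offered as a sketch rather than a tested replacement for the manual steps: it assumes the .obo file is UTF-8 encoded and that only lower-case alpha and omega need converting, as in the example command above. The function names and the mapping table are hypothetical, and the contents of convertGreekVim are not reproduced here.

```python
# Assumed mapping: only the two symbols named in the manual steps.
GREEK_TO_TEXT = {
    "\u03b1": "alpha",  # lower-case alpha
    "\u03c9": "omega",  # lower-case omega
}


def replace_greek(text):
    """Return (converted_text, counts), where counts maps each replacement
    word to the number of substitutions made."""
    counts = {}
    for symbol, word in GREEK_TO_TEXT.items():
        counts[word] = text.count(symbol)
        text = text.replace(symbol, word)
    return text, counts


def convert_file(src_path, dst_path):
    """Read src_path as UTF-8, replace the Greek symbols, write dst_path."""
    with open(src_path, encoding="utf-8") as handle:
        converted, counts = replace_greek(handle.read())
    with open(dst_path, "w", encoding="utf-8") as handle:
        handle.write(converted)
    return counts
```

The returned counts can be compared with the substitution totals in the sanity check; the per-line totals are not computed by this sketch.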