Relations Ontology

From WormBaseWiki
Skip where is_obsolete: true

Fields to include, mapped to ?RO_term model tag (note that we can re-use much of the ?GO_term parsing script):
*Id: -> RO_term
*Name: -> Name
*Status -> Valid
*alt_id: -> Alt_id
*Def: -> Definition (parse what is in quotes)
*Synonym: -> Synonym (parse what is in caps to match Type and then the text that is in quotes)
*Data-version: -> Version
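
The field mapping above could be sketched roughly as follows. This is a Python illustration only, not the Perl script on tazendra. The function name `stanza_to_ace`, the sample stanza values, and the exact .ace layout (object header line, tags written without their parent tag) are assumptions and would need to be checked against the ?GO_term parsing script.

```python
import re

# OBO synonym scopes mapped to the Synonym sub-tags of the ?RO_term model.
SCOPE_TO_TAG = {"BROAD": "Broad", "EXACT": "Exact", "NARROW": "Narrow", "RELATED": "Related"}
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')
SYNONYM = re.compile(r'"((?:[^"\\]|\\.)*)"\s+(BROAD|EXACT|NARROW|RELATED)\b')


def stanza_to_ace(lines, version=None):
    """Turn the 'tag: value' lines of one OBO stanza into .ace text (assumed layout).

    `version` is the data-version value from the OBO header, if known.
    Returns None for stanzas marked is_obsolete: true.
    """
    fields = [tuple(line.strip().split(": ", 1)) for line in lines if ": " in line]
    if ("is_obsolete", "true") in fields:
        return None  # Skip where is_obsolete: true
    term_id = next(value for tag, value in fields if tag == "id")
    out = [f'RO_term : "{term_id}"', "Valid"]  # Status -> Valid
    for tag, value in fields:
        if tag == "name":
            out.append(f'Name "{value}"')
        elif tag == "alt_id":
            out.append(f'Alt_id "{value}"')
        elif tag == "def":
            match = QUOTED.search(value)  # keep only what is in quotes
            if match:
                out.append(f'Definition "{match.group(1)}"')
        elif tag == "synonym":
            match = SYNONYM.search(value)  # quoted text, then the scope in caps
            if match:
                out.append(f'{SCOPE_TO_TAG[match.group(2)]} "{match.group(1)}"')
    if version:
        out.append(f'Version "{version}"')
    return "\n".join(out) + "\n"


# Made-up stanza for illustration; the def and synonym text are not real RO content.
sample = [
    "id: RO:0002327",
    "name: enables",
    'def: "Example definition text." []',
    'synonym: "example synonym" EXACT []',
]
print(stanza_to_ace(sample, version="example-version"))
```

The data-version value lives in the OBO header rather than in each stanza, which is why it is passed in as a separate argument here.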

Review GO term parser for handling is_a relationships to populate Child, Parent, Index

Parsing report/sanity check:
*What terms have no ancestors?
*What terms have no descendants?
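
As a rough Python sketch of how the Index tags and the sanity-check report might be derived, assuming the is_a lines have already been collected into a dictionary of term id to parent ids. The names `ancestors` and `closure_report` and the `X:n` ids are hypothetical; the GO term parser may well do this differently.

```python
def ancestors(term, parents, seen=None):
    """Return every term reachable from `term` by following is_a links upward."""
    seen = set() if seen is None else seen
    for parent in parents.get(term, ()):
        if parent not in seen:
            seen.add(parent)
            ancestors(parent, parents, seen)
    return seen


def closure_report(parents):
    """Compute Ancestor/Descendant sets plus the two sanity-check lists."""
    all_terms = set(parents) | {p for ps in parents.values() for p in ps}
    anc = {term: ancestors(term, parents) for term in all_terms}
    desc = {term: set() for term in all_terms}
    for term, term_ancestors in anc.items():
        for ancestor in term_ancestors:
            desc[ancestor].add(term)  # Descendant is the inverse of Ancestor
    no_ancestors = sorted(t for t in all_terms if not anc[t])
    no_descendants = sorted(t for t in all_terms if not desc[t])
    return anc, desc, no_ancestors, no_descendants


# Hypothetical ids: X:3 is_a X:2, X:2 is_a X:1, X:4 is_a X:1.
example_parents = {"X:3": ["X:2"], "X:2": ["X:1"], "X:4": ["X:1"]}
anc, desc, roots, leaves = closure_report(example_parents)
print("No ancestors:", roots)
print("No descendants:", leaves)
```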

= Update spelling to Descendant in ?GO_term model =
*Update line 89 in (change e -> a)

= go_gpad_parser modifications =
/home/acedb/kimberly/citace_upload/go/gpad2ace/2018_June_test/go_gpad_parser.pl

Line 130 - Will instead need a mapping between annotation relation and RO term id

When just relation -> Annotation_relation

When NOT|relation -> Annotation_relation_not

This may be temporary as it was proposed at the NYC GO meeting to start using RO ids in place of text in the qualifier/relation column of at least the GPAD file

Eventually this should also be the case for the Annotation extension relations, but there are some used in AEs that are not in RO

Mappings (as of June 12th):
*colocalizes_with RO:0002325 110
*contributes_to RO:0002326 277
*enables RO:0002327 22903
*involved_in RO:0002331 32498
*part_of BFO:0000050 33772

NOT annotations 224
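
The relation-to-RO-id mapping and the choice between the two tags could look something like the following Python sketch. It is not the code in go_gpad_parser.pl (which is Perl); the function name `relation_to_ace` and the form of the output line are assumptions, while the ids are the ones listed above.

```python
# Relation text -> RO term id, taken from the mappings listed above.
RELATION_TO_RO = {
    "colocalizes_with": "RO:0002325",
    "contributes_to": "RO:0002326",
    "enables": "RO:0002327",
    "involved_in": "RO:0002331",
    "part_of": "BFO:0000050",
}


def relation_to_ace(qualifier):
    """Return an .ace line for a qualifier such as 'enables' or 'NOT|enables'."""
    parts = qualifier.split("|")
    negated = "NOT" in parts
    relations = [part for part in parts if part != "NOT"]
    if len(relations) != 1:
        raise ValueError(f"expected exactly one relation in {qualifier!r}")
    ro_id = RELATION_TO_RO[relations[0]]  # KeyError flags an unmapped relation
    tag = "Annotation_relation_not" if negated else "Annotation_relation"
    return f'{tag} "{ro_id}"'


print(relation_to_ace("enables"))
print(relation_to_ace("NOT|part_of"))
```

Letting an unmapped relation raise an error, rather than passing the text through, is a choice made for this sketch so that new relations are noticed during parsing.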

= go_oa_parser modifications =
/home/postgres/work/citace_upload/go_curation/get_go_annotation_ace.pm

Line 56 - relations are stored in the gop_qualifier table

Will need to output the same as above

Mappings (as of June 12th):
*colocalizes_with RO:0002325 0
*contributes_to RO:0002326 0
*enables RO:0002327 20
*involved_in RO:0002331 334
*part_of BFO:0000050 29

NOT annotations 0

Latest revision as of 14:43, 3 January 2019

Overview

  • The Relations Ontology is used in WormBase to describe relations between entities in the database, e.g. genes and GO terms.
  • The Relations Ontology was first incorporated into WormBase with the WS267 release.

?RO_term Model

 ?RO_term Name UNIQUE ?Text
          Status UNIQUE Valid
                        Obsolete
          Alt_id ?Text
          Definition UNIQUE ?Text
          Synonym Broad ?Text
                  Exact ?Text
                  Narrow ?Text
                  Related ?Text
          Child Instance ?RO_term XREF Instance_of
          Parent Instance_of ?RO_term XREF Instance
          Attribute_of GO_annotation ?GO_annotation XREF Annotation_relation
                       Not_GO_annotation ?GO_annotation XREF Annotation_relation_not 
          Index Ancestor ?RO_term XREF Descendant
                Descendant ?RO_term XREF Ancestor
          Version UNIQUE Text

Parsing Script for ?RO_term Model

  • The parsing script to generate the ro_terms.ace file is on tazendra here:
    • /home/acedb/kimberly/citace_upload/ro/ontology2ace/ro_obo_to_ro_ace.pl
  • Input file:
    • Downloaded, manually processed version (see below) of https://raw.githubusercontent.com/oborel/obo-relations/master/ro.obo
  • Output file:
    • ro_terms.ace
  • There are two types of entries in ro.obo
    • Term
    • Typedef
  • At the moment, we include Term and Typedef entries according to the following criteria:
    • Include entries whose id is in the BFO or RO namespace, or is lower-case text with no namespace (e.g. results_in_acquisition_of_features_of)
    • Skip entries whose id namespace is CARO, CL, ENVO, GO, ObsoleteClass, or PATO
  • Note that the terms we skip are used in the OWL representation of the Relations Ontology, but not in the OBO representation
    • For the purposes of how the Relations Ontology will initially be used in WormBase, this filtering step should be fine
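
The include/skip criteria above amount to a small filter on the id of each entry. Below is a minimal Python sketch of that filter, not the actual logic in ro_obo_to_ro_ace.pl. The function name `keep_entry` is hypothetical, and raising an error for a namespace that is in neither list is an assumption of this sketch, since the criteria do not say what to do in that case.

```python
INCLUDED_NAMESPACES = {"BFO", "RO"}
SKIPPED_NAMESPACES = {"CARO", "CL", "ENVO", "GO", "ObsoleteClass", "PATO"}


def keep_entry(entry_id):
    """Return True if a Term or Typedef with this id should be written out."""
    namespace, colon, _ = entry_id.partition(":")
    if not colon:
        # No namespace: keep only lower-case ids such as
        # results_in_acquisition_of_features_of.
        return entry_id.islower()
    if namespace in INCLUDED_NAMESPACES:
        return True
    if namespace in SKIPPED_NAMESPACES:
        return False
    # Assumption: an unlisted namespace is surfaced rather than silently skipped.
    raise ValueError(f"unexpected id namespace: {namespace}")


for example_id in ("RO:0002327", "results_in_acquisition_of_features_of", "GO:0008150"):
    print(example_id, keep_entry(example_id))
```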
  • Running the script:
  1. In the /home/acedb/kimberly/citace_upload/ro/ontology2ace directory on tazendra, use wget to download the ro.obo input file from https://raw.githubusercontent.com/oborel/obo-relations/master/ro.obo
  2. Convert the Greek characters used in some of the RO term definitions (for example, see id: BFO:0000063 name: precedes) to text that ACeDB can render, using the steps below. (We tried to incorporate this into the script, but couldn't find a way to correctly recognize these characters programmatically.)
  • In one terminal window, open the ro.obo file in a text editor
  • In a second terminal window, open the convertGreekVim file in a text editor
  • Globally replace the lower case alpha and omega symbols in ro.obo by copying and pasting the appropriate global replacement commands from convertGreekVim, e.g. :%s/α/alpha/g
    • Sanity check:
      • 26 substitutions of lower-case alpha on 10 lines
      • 24 substitutions of lower-case omega on 9 lines
  • Once the Greek characters have been replaced, run the parsing script and rename the output file from ro_terms.ace to ro_terms_WSnnn.ace
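
For reference, the manual replacement and its sanity check could be approximated outside the editor with something like the Python sketch below. This is an untested alternative, not part of the current procedure: it assumes ro.obo is UTF-8 encoded, and the names `replace_greek` and `convert_file` are hypothetical. The counts it reports can be compared with the expected numbers of substitutions and lines listed above for the download in question.

```python
# Lower-case alpha and omega, written as escapes so the source stays ASCII.
GREEK_TO_TEXT = {"\u03b1": "alpha", "\u03c9": "omega"}


def replace_greek(text):
    """Replace Greek symbols and count substitutions and affected lines.

    Returns (new_text, counts), where counts maps each replacement name to a
    (substitutions, lines) tuple, mirroring what vim reports for :%s.
    """
    counts = {}
    out_lines = []
    for line in text.splitlines(keepends=True):
        for symbol, name in GREEK_TO_TEXT.items():
            hits = line.count(symbol)
            if hits:
                subs, lines = counts.get(name, (0, 0))
                counts[name] = (subs + hits, lines + 1)
                line = line.replace(symbol, name)
        out_lines.append(line)
    return "".join(out_lines), counts


def convert_file(source_path, target_path):
    """Read a UTF-8 file, write the converted copy, and return the counts."""
    with open(source_path, encoding="utf-8") as handle:
        new_text, counts = replace_greek(handle.read())
    with open(target_path, "w", encoding="utf-8") as handle:
        handle.write(new_text)
    return counts
```

Calling `convert_file("ro.obo", "ro_converted.obo")` (the output file name is made up) would leave the download untouched and return the counts for checking.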