Revision as of 20:04, 16 June 2015
Updating the GO_term.ace File - Script Specifications
- Output file: go.ace
- Model changes - added Status and Synonym tags.
- Model changes for WS250 - Remove Term tag and put GO Name information in Name tag - needs script change
WormBase ?GO_term Model:
?GO_term Name ?Text Status UNIQUE Valid Obsolete Definition ?Text Term ?Text Synonym Broad ?Text Exact ?Text Narrow ?Text Related ?Text Type UNIQUE Biological_process Cellular_component Molecular_function Child Instance ?GO_term XREF Instance_of Component ?GO_term XREF Component_of Parent Instance_of ?GO_term XREF Instance Component_of ?GO_term XREF Component Attribute_of Cell ?Cell XREF GO_term Motif ?Motif XREF GO_term Gene ?Gene XREF GO_term CDS ?CDS XREF GO_term Sequence ?Sequence XREF GO_term Transcript ?Transcript XREF GO_term Phenotype ?Phenotype XREF GO_term Index Ancestor ?GO_term XREF Descendent Descendent ?GO_term XREF Ancestor Anatomy_term ?Anatomy_term XREF GO_term Homology_group ?Homology_group XREF GO_term Expr_pattern ?Expr_pattern XREF GO_term Picture ?Picture XREF Cellular_component Version UNIQUE Text UNIQUE Text
Sample GO term in .obo file:

    [Term]
    id: GO:0000003
    name: reproduction
    namespace: biological_process
    alt_id: GO:0019952
    alt_id: GO:0050876
    def: "The production by an organism of new individuals that contain some portion of their genetic material inherited from that organism." [GOC:go_curators, GOC:isa_complete, ISBN:0198506732]
    subset: goslim_generic
    subset: goslim_pir
    subset: goslim_plant
    subset: gosubset_prok
    synonym: "reproductive physiological process" EXACT []
    xref: Wikipedia:Reproduction
    is_a: GO:0008150 ! biological_process

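Since .obo files are line-oriented (one `tag: value` pair per line inside each `[Term]` stanza), reading the terms can be sketched roughly as follows. This is a minimal illustration in Python, not the WormBase script itself; the function name `parse_obo_terms` and the `SAMPLE` string are assumptions made up for this example.

```python
from collections import defaultdict

# Abbreviated copy of the sample term above, one tag per line.
SAMPLE = """\
format-version: 1.2

[Term]
id: GO:0000003
name: reproduction
namespace: biological_process
alt_id: GO:0019952
alt_id: GO:0050876
synonym: "reproductive physiological process" EXACT []
is_a: GO:0008150 ! biological_process
"""


def parse_obo_terms(text):
    """Return one {tag: [values]} mapping per [Term] stanza (hypothetical helper)."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = defaultdict(list) if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            # Split on the first ": " only, since values may contain colons.
            tag, value = line.split(": ", 1)
            current[tag].append(value)
    return terms


terms = parse_obo_terms(SAMPLE)
```

Each tag maps to a list because tags such as `alt_id:`, `synonym:` and `is_a:` can occur more than once in a single term. Header lines before the first stanza are skipped.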
Sample GO Header:

    format-version: 1.2
    data-version: 2013-06-24
    date: 23:06:2013 14:54
    saved-by: rl
    auto-generated-by: TermGenie 1.0
    subsetdef: Cross_product_review "Involved_in"
    subsetdef: gocheck_do_not_annotate "Term not to be used for direct annotation"
    subsetdef: gocheck_do_not_manually_annotate "Term not to be used for direct manual annotation"
    subsetdef: goslim_aspergillus "Aspergillus GO slim"
    subsetdef: goslim_candida "Candida GO slim"
    subsetdef: goslim_generic "Generic GO slim"
    subsetdef: goslim_metagenomics "Metagenomics GO slim"
    subsetdef: goslim_pir "PIR GO slim"
    subsetdef: goslim_plant "Plant GO slim"
    subsetdef: goslim_pombe "Fission yeast GO slim"
    subsetdef: goslim_yeast "Yeast GO slim"
    subsetdef: gosubset_prok "Prokaryotic GO subset"
    subsetdef: mf_needs_review "Catalytic activity terms in need of attention"
    subsetdef: termgenie_unvetted "Terms created by TermGenie that do not follow a template and require additional vetting by editors"
    subsetdef: virus_checked "Viral overhaul terms"
    synonymtypedef: systematic_synonym "Systematic synonym" EXACT
    default-namespace: gene_ontology
    remark: cvs version: $Revision: 9500 $
    ontology: go

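The only header line the .ace output draws on is `remark: cvs version:`, which supplies the Version value. A rough Python sketch of pulling it out is shown here; the function name `go_version` and the `HEADER` string are hypothetical, and the sketch assumes the header has one tag per line as in a real .obo file.

```python
import re

# Abbreviated header, one tag per line.
HEADER = """\
format-version: 1.2
data-version: 2013-06-24
default-namespace: gene_ontology
remark: cvs version: $Revision: 9500 $
ontology: go
"""


def go_version(header_text):
    """Return 'Gene Ontology ' plus the text between the $ signs, or None."""
    for line in header_text.splitlines():
        if line.startswith("remark: cvs version:"):
            match = re.search(r"\$([^$]*)\$", line)
            if match:
                # The captured text keeps its trailing space ("Revision: 9500 ").
                return "Gene Ontology " + match.group(1)
    return None


version = go_version(HEADER)
```

The value is returned with the trailing white space intact; call `.rstrip()` on the result if the space should be dropped.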
- Mapping from .obo file to .ace file:
.ace tag name | .obo tag name | Action |
---|---|---|
Name | id: | Add corresponding value in double quotes. |
Status | is_obsolete: | If tag is not present, Status should be set to Valid. If tag is present, Status should be set to Obsolete. |
Definition | def: | Add corresponding value including double quotes. Omit information in brackets at the end of the definition. |
Term | name: | Add corresponding value in double quotes. |
Broad, Exact, Narrow, or Related | synonym: | For each synonym, check text after double quotes to populate Broad, Exact, Narrow, or Related. For ?Text add value in double quotes. Ignore information in brackets. Note that a single GO term object can have multiple synonyms. |
Type | namespace: | Make first letter upper case and add corresponding value. |
Instance | is_a: | Find every entry whose is_a: tag contains this object's Name (id:). Add the id: of each such entry in double quotes. Can have multiple values. |
Component | relationship: part_of | Find every entry whose relationship: part_of tag contains this object's Name (id:). Add the id: of each such entry in double quotes. Can have multiple values. |
Instance_of | is_a: | Add each GO id in this object's own is_a: tag(s) in double quotes. Can have multiple values. |
Component_of | relationship: part_of | Add each GO id in this object's own relationship: part_of tag(s) in double quotes. Can have multiple values. |
Ancestor | is_a: and relationship: part_of | Start with the GO ids in this object's own is_a: and relationship: part_of tags and add each in double quotes. Then repeat for every id added, following its is_a: and relationship: part_of tags, until the root node is reached (the root has no is_a: or relationship: part_of). Remove any redundant values from the list. |
Descendent | is_a: and relationship: part_of | Find every entry whose is_a: or relationship: part_of tag contains this object's Name and add that entry's id: in double quotes. Then repeat for every id: added, looking for it in the is_a: and relationship: part_of tags of other entries, until no further entries are found. Remove any redundant values from the list. |
Version | In header, remark: cvs version: | Gene Ontology followed by value in between $ signs after cvs version:. There is a trailing white space after the last digit of the version number; we'll leave that white space in for now. |
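
The per-term rows of the table (Name, Status, Definition, Term, synonyms, Type, Instance_of and Component_of) could be applied along these lines. This is a hedged Python sketch rather than the production script: the function name `ace_lines` and the input shape (a dict mapping each .obo tag to a list of its values) are assumptions made for illustration, and the output follows the layout of the examples on this page.

```python
import re

SCOPES = {"BROAD": "Broad", "EXACT": "Exact", "NARROW": "Narrow", "RELATED": "Related"}


def ace_lines(term):
    """Build .ace lines for one term given as {obo_tag: [values]} (hypothetical shape)."""
    out = ['GO_term : "%s"' % term["id"][0]]
    # Status: Obsolete if is_obsolete: is present, otherwise Valid.
    out.append('Status "%s"' % ("Obsolete" if "is_obsolete" in term else "Valid"))
    for value in term.get("def", []):
        # Keep the quoted definition, drop the bracketed references at the end.
        m = re.match(r'"(.*)"\s*\[.*\]\s*$', value)
        out.append('Definition "%s"' % (m.group(1) if m else value.strip('"')))
    for value in term.get("name", []):
        out.append('Term "%s"' % value)
    for value in term.get("synonym", []):
        # The scope keyword follows the quoted text; bracketed text is ignored.
        m = re.match(r'"(.*)"\s+(BROAD|EXACT|NARROW|RELATED)\b', value)
        if m:
            out.append('%s "%s"' % (SCOPES[m.group(2)], m.group(1)))
    for value in term.get("namespace", []):
        out.append(value[:1].upper() + value[1:])
    for value in term.get("is_a", []):
        # "GO:0044707 ! label" -> "GO:0044707"
        out.append('Instance_of "%s"' % value.split(" ! ")[0].strip())
    for value in term.get("relationship", []):
        parts = value.split()
        if len(parts) >= 2 and parts[0] == "part_of":
            out.append('Component_of "%s"' % parts[1])
    return out


lifespan = {
    "id": ["GO:0008340"],
    "name": ["determination of adult lifespan"],
    "namespace": ["biological_process"],
    "def": ['"The control of viability and duration in the adult phase of the life-cycle." [GOC:ems]'],
    "is_a": ["GO:0044707 ! single-multicellular organism process"],
    "relationship": ["part_of GO:0010259 ! multicellular organismal aging"],
}
result = ace_lines(lifespan)
```

Instance, Component, Ancestor and Descendent are left out here because they depend on other entries in the file, not only on the term's own tags.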
- Some possible terms to check:
- GO:0005635 nuclear envelope
- GO:0007192 adenylate cyclase-activating serotonin receptor signaling pathway
- GO:0008340 determination of adult lifespan
- GO:0003729 mRNA binding
- GO:1900529 regulation of cell shape involved in cellular response to glucose starvation
Examples: Terms in .obo and .ace
I found it helpful to use the GOOSE query tool to check the ancestors and descendents: http://berkeleybop.org/goose
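
As a local cross-check alongside GOOSE, the Ancestor and Descendent lists can also be derived by walking the is_a: and part_of edges until nothing new turns up. The Python sketch below assumes the edges have already been collected into a `{child id: set of parent ids}` mapping; the function names and the toy ids (`X:1` to `X:4`) are made up for illustration and are not real GO terms.

```python
from collections import defaultdict


def closure(start, edges):
    """Return every node reachable from start by following edges (start excluded)."""
    seen = set()
    stack = list(edges.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            # Using a set removes redundant values automatically.
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return seen


def invert(parents):
    """Turn {child: {parents}} into {parent: {children}}."""
    children = defaultdict(set)
    for child, parent_ids in parents.items():
        for parent in parent_ids:
            children[parent].add(child)
    return children


# Toy graph: X:4 has parents X:2 and X:3, which both have parent X:1 (the root).
PARENTS = {"X:4": {"X:2", "X:3"}, "X:3": {"X:1"}, "X:2": {"X:1"}, "X:1": set()}

ancestors = closure("X:4", PARENTS)
descendants = closure("X:1", invert(PARENTS))
```

Here `ancestors` holds `X:1`, `X:2` and `X:3`, and `descendants` holds `X:2`, `X:3` and `X:4`. The direct children returned by `invert` correspond to the Instance and Component values, split by which .obo tag supplied the edge.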
For the term 'nuclear envelope' GO:0005635
.obo

    [Term]
    id: GO:0005635
    name: nuclear envelope
    namespace: cellular_component
    alt_id: GO:0005636
    def: "The double lipid bilayer enclosing the nucleus and separating its contents from the rest of the cytoplasm; includes the intermembrane space, a gap of width 20-40 nm (also called the perinuclear space)." [ISBN:0198547684]
    subset: goslim_generic
    subset: goslim_plant
    xref: Wikipedia:Nuclear_envelope
    is_a: GO:0031967 ! organelle envelope
    is_a: GO:0044428 ! nuclear part
    relationship: part_of GO:0012505 ! endomembrane system

.ace

    GO_term : "GO:0005635"
    Status "Valid"
    Definition "The double lipid bilayer enclosing the nucleus and separating its contents from the rest of the cytoplasm; includes the intermembrane space, a gap of width 20-40 nm (also called the perinuclear space)."
    Term "nuclear envelope"
    Cellular_component
    Component "GO:0034992"
    Component "GO:0005641"
    Component "GO:0031965"
    Component "GO:0005643"
    Component "GO:0044195"
    Instance_of "GO:0044428"
    Instance_of "GO:0031967"
    Component_of "GO:0012505"
    Ancestor "GO:0005575"
    Ancestor "GO:0005623"
    Ancestor "GO:0044464"
    Ancestor "GO:0012505"
    Ancestor "GO:0031975"
    Ancestor "GO:0031967"
    Ancestor "GO:0005622"
    Ancestor "GO:0044424"
    Ancestor "GO:0043229"
    Ancestor "GO:0043231"
    Ancestor "GO:0005634"
    Ancestor "GO:0044428"
    Ancestor "GO:0044446"
    Ancestor "GO:0043226"
    Ancestor "GO:0043227"
    Ancestor "GO:0044422"
    Descendent "GO:0034992"
    Descendent "GO:0005641"
    Descendent "GO:0031965"
    Descendent "GO:0005643"
    Descendent "GO:0044195"
    Descendent "GO:0034993"
    Descendent "GO:0005637"
    Descendent "GO:0044453"
    Descendent "GO:0005640"
    Descendent "GO:0031229"
    Descendent "GO:0005639"
    Descendent "GO:0002180"
    Descendent "GO:0031316"
    Descendent "GO:0031308"
    Descendent "GO:0031309"
    Descendent "GO:0044613"
    Descendent "GO:0044614"
    Descendent "GO:0044611"
    Descendent "GO:0044612"
    Descendent "GO:0044615"
    Descendent "GO:0031080"
    Descendent "GO:0070762"
    Version "Gene Ontology Revision: 9500"

For the term 'adenylate cyclase-activating serotonin receptor signaling pathway' GO:0007192

.obo

    [Term]
    id: GO:0007192
    name: adenylate cyclase-activating serotonin receptor signaling pathway
    namespace: biological_process
    def: "The series of molecular signals generated as a consequence of a serotonin receptor binding to its physiological ligand, where the pathway proceeds with activation of adenylyl cyclase and a subsequent increase in the concentration of cyclic AMP (cAMP)." [GOC:dph, GOC:mah, GOC:signaling, GOC:tb]
    synonym: "activation of adenylate cyclase activity by serotonin receptor signalling pathway" RELATED [GOC:mah]
    synonym: "serotonin receptor, adenylate cyclase activating pathway" RELATED [GOC:dph, GOC:tb]
    synonym: "serotonin receptor, adenylyl cyclase activating pathway" EXACT []
    is_a: GO:0007189 ! adenylate cyclase-activating G-protein coupled receptor signaling pathway
    is_a: GO:0007210 ! serotonin receptor signaling pathway

.ace

    GO_term : "GO:0007192"
    Status "Valid"
    Definition "The series of molecular signals generated as a consequence of a serotonin receptor binding to its physiological ligand, where the pathway proceeds with activation of adenylyl cyclase and a subsequent increase in the concentration of cyclic AMP (cAMP)."
    Term "adenylate cyclase-activating serotonin receptor signaling pathway"
    Related "activation of adenylate cyclase activity by serotonin receptor signalling pathway"
    Related "serotonin receptor, adenylate cyclase activating pathway"
    Exact "serotonin receptor, adenylyl cyclase activating pathway"
    Biological_process
    Instance_of "GO:0007189"
    Instance_of "GO:0007210"
    Ancestor "GO:0007189"
    Ancestor "GO:0007210"
    Ancestor "GO:0007188"
    Ancestor "GO:0007186"
    Ancestor "GO:0007166"
    Ancestor "GO:0007165"
    Ancestor "GO:0007187"
    Ancestor "GO:0050794"
    Ancestor "GO:0051716"
    Ancestor "GO:0050789"
    Ancestor "GO:0044763"
    Ancestor "GO:0050896"
    Ancestor "GO:0009987"
    Ancestor "GO:0044699"
    Ancestor "GO:0008150"
    Ancestor "GO:0065007"
    Ancestor "GO:0007154"
    Ancestor "GO:0044700"
    Ancestor "GO:0023052"
    Version "Gene Ontology Revision: 9500"

For the term 'determination of adult lifespan' GO:0008340
.obo

    [Term]
    id: GO:0008340
    name: determination of adult lifespan
    namespace: biological_process
    def: "The control of viability and duration in the adult phase of the life-cycle." [GOC:ems]
    is_a: GO:0044707 ! single-multicellular organism process
    relationship: part_of GO:0010259 ! multicellular organismal aging

.ace

    GO_term : "GO:0008340"
    Status "Valid"
    Definition "The control of viability and duration in the adult phase of the life-cycle."
    Term "determination of adult lifespan"
    Biological_process
    Component "GO:1901047"
    Instance_of "GO:0044707"
    Component_of "GO:0010259"
    Ancestor "GO:0010259"
    Ancestor "GO:0007275"
    Ancestor "GO:0007568"
    Ancestor "GO:0044707"
    Ancestor "GO:0044767"
    Ancestor "GO:0032501"
    Ancestor "GO:0044699"
    Ancestor "GO:0032502"
    Ancestor "GO:0008150"
    Descendent "GO:1901047"
    Version "Gene Ontology Revision: 9500"

For the term 'mRNA binding' GO:0003729
.obo

    [Term]
    id: GO:0003729
    name: mRNA binding
    namespace: molecular_function
    def: "Interacting selectively and non-covalently with messenger RNA (mRNA), an intermediate molecule between DNA and protein. mRNA includes UTR and coding sequences, but does not contain introns." [GOC:kmv, SO:0000234]
    subset: goslim_generic
    subset: goslim_yeast
    subset: gosubset_prok
    is_a: GO:0003723 ! RNA binding

.ace

    GO_term : "GO:0003729"
    Status "Valid"
    Definition "Interacting selectively and non-covalently with messenger RNA (mRNA), an intermediate molecule between DNA and protein. mRNA includes UTR and coding sequences, but does not contain introns."
    Term "mRNA binding"
    Molecular_function
    Instance "GO:0030350"
    Instance "GO:0003730"
    Instance "GO:0048027"
    Instance "GO:0008143"
    Instance "GO:0035368"
    Instance_of "GO:0003723"
    Ancestor "GO:0003723"
    Ancestor "GO:0003676"
    Ancestor "GO:1901363"
    Ancestor "GO:0097159"
    Ancestor "GO:0005488"
    Ancestor "GO:0003674"
    Descendent "GO:0030350"
    Descendent "GO:0003730"
    Descendent "GO:0048027"
    Descendent "GO:0008143"
    Descendent "GO:0035368"
    Descendent "GO:0035925"
    Version "Gene Ontology Revision: 9500"

For the term 'regulation of cell shape involved in cellular response to glucose starvation' GO:1900529

.obo

    [Term]
    id: GO:1900529
    name: regulation of cell shape involved in cellular response to glucose starvation
    namespace: biological_process
    def: "OBSOLETE. Any regulation of cell shape that is involved in cellular response to glucose starvation." [GOC:al, GOC:TermGenie, PMID:9135147]
    comment: This term was obsoleted at the TermGenie Gatekeeper stage.
    is_obsolete: true
    created_by: al
    creation_date: 2012-05-08T01:43:38Z

.ace

    GO_term : "GO:1900529"
    Status "Obsolete"
    Definition "OBSOLETE. Any regulation of cell shape that is involved in cellular response to glucose starvation."
    Term "regulation of cell shape involved in cellular response to glucose starvation"
    Biological_process
    Version "Gene Ontology Revision: 9500"

Back to Gene Ontology