Ontology Annotator

Description

The Ontology Annotator (OA) is a web-based curation tool developed by WormBase for a variety of curation tasks, including phenotype curation, attaching GO terms to genes, genetic interactions, transgenes, free-text descriptions of genes, and several other data types. The OA uses CGI, JavaScript, and a PostgreSQL database. Because it is web-based, it avoids problems caused by operating system differences and does not require the user to install dependent software. The OA is similar in many ways to Phenote (phenote.org). It consists mainly of an Editor, where new data are entered and existing data can be queried and edited; a Data-Table for data review; and a Term-Information panel, which displays information about a term such as IDs and synonyms. Features include term autocomplete from pre-loaded ontologies, fast AJAX loading of terms, saving to and querying from a PostgreSQL database, duplicating data, editing several lines of data at once, and filtering the data. The display can be customized to some extent: columns in the Data-Table can be rearranged by dragging, and the width of each column can be adjusted. For complex data types with several fields, the OA allows a tabbed organization.

List of WormBase OA configurations:

  • Antibody
  • Concise description
  • Disease
  • Expression pattern
  • Gene class
  • Gene ontology
  • Gene regulation
  • Interaction
  • Molecule
  • Movie
  • Phenotype
  • Picture
  • Process Term
  • RNAi
  • Topic
  • Transgene


The OA uses:

  • Perl CGI.
  • Yahoo!'s YUI library and a local JavaScript file.
  • PostgreSQL database backend (could probably be adapted to other SQL databases).
  • Apache web server.
  • Documentation for the main CGI, JavaScript, and modules: OA docs

The above description can also be found here: Web-page for OA

Wish List

  1. Include dependencies wherever possible. For example, when making an IMP annotation for a given gene, offer a gene-specific drop-down menu of alleles or RNAi experiments for the WITH column. Or, when making an IGI annotation for a paper, offer a drop-down list of all genes mentioned in the paper. Similarly, when entering a GO term, have the ontology (P, F, or C) entered automatically. (From Curation Interface Meeting)
  2. Term information window: the information shown should reflect where the cursor is placed in the editor window, e.g. when the cursor is in the Paper field, the window should show paper info.


Batch upload to OA from tab-delimited file

[New as of November 2013] A script has been written that will allow curators to upload data in bulk to the OA through the submission of a properly formatted tab-delimited (TSV) file. The script is located on Mangolassi/Tazendra at:

/home/postgres/public_html/cgi-bin/oa/scripts/populate_oa_tab_file/populate_oa_tab_file.pl

Usage

Change (cd) into the directory containing the script and run it by entering:

./populate_oa_tab_file.pl mangolassi #### testfile

to load data into the Sandbox OA (on Mangolassi), where '####' is the curator's WBPerson ID, or

./populate_oa_tab_file.pl tazendra #### realfile

to load data into the Live OA (on Tazendra), where '####' is the curator's WBPerson ID.
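
The two invocations above differ only in the server name and input file, so a small helper can assemble the argument list and catch a mistyped server before anything is run. This is a minimal Python sketch, not part of the OA code: the function name `build_upload_command` is hypothetical, and the check that the WBPerson ID consists only of digits is an assumption based on the '####' placeholder.

```python
# Server names accepted by the script, as described above.
SERVERS = {"mangolassi": "Sandbox OA", "tazendra": "Live OA"}
SCRIPT = "./populate_oa_tab_file.pl"


def build_upload_command(server, person_id, tsv_path):
    """Return the argument list for one bulk-upload run (hypothetical helper)."""
    if server not in SERVERS:
        raise ValueError(f"unknown server: {server!r}")
    # Assumption: the WBPerson ID is passed as digits only ('####').
    if not str(person_id).isdigit():
        raise ValueError(f"WBPerson ID should be numeric: {person_id!r}")
    return [SCRIPT, server, str(person_id), tsv_path]


if __name__ == "__main__":
    # Build the sandbox command; nothing is executed here.
    print(" ".join(build_upload_command("mangolassi", "1234", "testfile")))
```

The returned list could be handed to `subprocess.run(cmd, cwd=script_directory)`, where `script_directory` is the directory containing the script; trying a file against Mangolassi first keeps test data out of the Live OA.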


OAs capable of accepting bulk upload

As of November 2013, the following OAs can accept bulk uploads via this method:

  • Transgene
  • Process Term
  • Topic
  • RNAi


Tab-delimited (TSV) file format

The TSV file must be formatted properly. Each column header must be the name of a Postgres table into which data will be uploaded. All column headers in a single file should be Postgres table names for the same OA, so that each row in the spreadsheet/TSV file becomes a single entry (with a unique PGID) in the OA/Postgres.
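
A file can be checked against these formatting rules before it is submitted. The following Python sketch is only an illustration and is not part of the upload script: the function `check_oa_tsv` is hypothetical, the caller must supply the set of valid Postgres table names for the target OA, and the names `xyz_name` and `xyz_paper` in the example are placeholders rather than real OA tables.

```python
import csv
import io


def check_oa_tsv(text, allowed_tables):
    """Return a list of problems found in TSV text; an empty list means none found."""
    # QUOTE_NONE treats quote characters as ordinary data, splitting on tabs only.
    rows = list(csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    # Every column header must be a table name of the target OA.
    for name in header:
        if name not in allowed_tables:
            problems.append(f"unknown column header: {name!r}")
    if len(set(header)) != len(header):
        problems.append("duplicate column header")
    # Each data row is one entry, so it needs one field per column header.
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, found {len(row)}"
            )
    return problems


if __name__ == "__main__":
    # Placeholder table names; substitute the real tables of the target OA.
    tables = {"xyz_name", "xyz_paper"}
    sample = "xyz_name\txyz_paper\nexample entry\tWBPaper00000001\n"
    print(check_oa_tsv(sample, tables))
```

The check is deliberately limited to the header and field counts; whether the values themselves are acceptable is left to the upload script and the OA.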

