Difference between revisions of "WormBaseComparativeGenomicsTools"

From WormBaseWiki

Revision as of 21:41, 23 September 2007

Genomic Alignments

Methods

WABA

Pairwise whole-genome alignments are computed using [UCSC WABA]. The data is available from the Gene pages (the link is called "syntenic alignment"), or in the ACeDB database as Homol_data on the C. elegans and C. briggsae chromosomes.


PECAN

The multiple genome aligner [EBI PECAN] is used in the [EnsEMBL Compara] pipeline to produce multiple alignments. PECAN replaced MLAGAN in the EnsEMBL Compara pipeline between WS173 and WS174.


Viewers

  • A preview of the new GBrowse-based synteny viewer is available at [dev.wormbase.org].
  • There is also a demo site showing the multiple alignments ([PECAN test]). The test website lets you search for C. elegans WBGeneIDs and shows the C. elegans / C. briggsae / C. remanei / C. brenneri alignments in the region overlapping the gene (exons are highlighted).

Raw Data

  • The chromosome GFF files in the release files contain WABA entries.
  • ACeDB contains WABA Homol_data on the chromosomes.
  • EnsEMBL MySQL dumps are available on request from the Sanger FTP server [WS176 based dumps]. They contain 4 ensembl-core databases and 2 ensembl-compara databases (1 compara multiple-alignment / 1 compara ortholog).
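As a rough illustration of working with the chromosome GFF files, the sketch below filters tab-separated GFF lines for WABA entries. It assumes that WABA features can be recognised by "waba" appearing in the source (second) column; that convention, the function name `waba_features`, and the two sample lines are assumptions made for this example and should be checked against the actual release files.

```python
import io


def waba_features(gff_lines):
    """Yield (seqid, source, start, end) for lines whose source column mentions WABA.

    Assumption: WABA entries carry "waba" in the GFF source column.
    """
    for line in gff_lines:
        # Skip blank lines and GFF comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # not a well-formed feature line
        seqid, source, _feature, start, end = cols[:5]
        if "waba" in source.lower():
            yield seqid, source, int(start), int(end)


# Hypothetical sample data, invented for illustration only.
example = io.StringIO(
    "##gff-version 2\n"
    "CHROMOSOME_I\twaba_coding\tnucleotide_match\t100\t250\t.\t+\t.\tTarget \"Sequence:example\" 5 155\n"
    "CHROMOSOME_I\tcurated\texon\t300\t400\t.\t+\t.\tTranscript \"example.1\"\n"
)
hits = list(waba_features(example))
```

For a real file, the same function can be fed an open file handle instead of the `io.StringIO` sample.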


Phylogenetics

Imported Data

TreeFam

TreeFam data is updated during the regular build from the [treefam.org] database. It is also viewable on the gene pages as a picture of a phylogenetic tree.


Inparanoid

Inparanoid data is updated whenever the [Inparanoid] database is updated. Three different clusters are shown on the gene pages (nematode / metazoa / rest).


KOGs

[KOGs] are clusters of eukaryotic orthologous groups and are updated whenever the source database is updated. The clusters show predicted orthologues in fully sequenced eukaryotes.

WormBase Data

Compara Orthologs

Orthologue relationships of genes represented in WormBase (C. elegans / C. briggsae / C. remanei / C. brenneri) are predicted during the regular build and included in the database (C. remanei orthologs are shown as Ortholog_others and C. brenneri orthologs are masked at the moment). Orthologue_others (viewable in the Treeview of the Gene pages) contain orthologues to genes which are not (yet) included in the ACeDB database. The prediction of the nematode orthologs is based on conservation of gene order and homology in syntenic regions. We might switch to the gene tree code from EnsEMBL Compara for higher specificity, depending on how much sensitivity we lose.

Curated Orthologs

Ortholog relationships published and/or submitted are shown with their supporting evidence (papers / authors / persons) in addition to the predictions. They are also viewable on the Gene pages if available.

EnsEMBL gene trees

A tree-based method to determine orthology/paralogy relationships of proteins. It is currently being tested by WormBase and is used by the main EnsEMBL releases.



pitfalls

orthologs

  • Different programs sometimes predict different orthologs for a gene. To make sure you pick the most probable ortholog, view ALL available data and base your decision on that. You can also include available phenotypes from RNAi/knockouts as well as expression profiles in your decision.
  • Some genes have duplications, leading to one-to-many or many-to-many relationships.
  • Predicted gene models, especially those from newly sequenced organisms (briggsae/remanei/brenneri), are not always 100% correct, which leads to unclear ortholog predictions. If you see such a case, submit your comments to WormBase (link at the bottom of the pages) so it can be fixed as fast as possible.
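To make the one-to-many and many-to-many cases concrete, here is a minimal sketch that labels each ortholog pair by how many partners each gene has in the other species. It is not WormBase code: the function name `relationship_types` and the gene identifiers are hypothetical, and the labelling is done per pair rather than per connected gene family, which is a simplification.

```python
from collections import defaultdict


def relationship_types(pairs):
    """Label each (gene_a, gene_b) ortholog pair by its partner counts.

    pairs: iterable of (gene in species A, gene in species B).
    Returns a dict mapping each pair to "one-to-one", "one-to-many"
    or "many-to-many".
    """
    pairs = list(pairs)
    partners_of_a = defaultdict(set)  # gene in A -> its partners in B
    partners_of_b = defaultdict(set)  # gene in B -> its partners in A
    for a, b in pairs:
        partners_of_a[a].add(b)
        partners_of_b[b].add(a)

    labels = {}
    for a, b in pairs:
        a_has_many = len(partners_of_a[a]) > 1
        b_has_many = len(partners_of_b[b]) > 1
        if a_has_many and b_has_many:
            labels[(a, b)] = "many-to-many"
        elif a_has_many or b_has_many:
            labels[(a, b)] = "one-to-many"
        else:
            labels[(a, b)] = "one-to-one"
    return labels


# Hypothetical identifiers, invented for illustration only.
example_pairs = [("ce1", "cb1"), ("ce2", "cb2"), ("ce2", "cb3")]
labels = relationship_types(example_pairs)
```

Pairs that come out as anything other than one-to-one are the ones where checking the additional evidence (phenotypes, expression profiles, other prediction methods) is most worthwhile.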

data export

  • Try WormMart for data mining (the orthologs are under Filter: Homologs / Orthologs).
    • Inparanoid, KOGs and Compara can be accessed this way.
  • BioMart contains EnsEMBL orthologs based on WS170, if you prefer BioMart to ACeDB.
  • Use flat files or the ACeDB downloads for quicker local analysis.
  • WS180 will contain additional orthology data from L. Hillier and E. Schwarz as well as OMIM.