Difference between revisions of "WormBase Genomes"

From WormBaseWiki
m (→‎The current genomes: comment formatting and warning highlights)
Revision as of 10:35, 28 February 2012

WormBase Genomes

This is a record of the current and proposed set of genomes in WormBase.

We may alter these plans as circumstances dictate, so the list of organisms to be included should be treated as tentative.

The following are some of the columns in the Genome Table.

Clade

The Major Clades of Blaxter et al 1998 ("Bclades"), as systematised by De Ley and Blaxter 2002-2004.

Gene-set

The origin of the gene-set. One of: Curated (curated by WormBase), Predicted (made by a gene-prediction pipeline), External (produced by another group), None (no gene-set available).

Genome

The Genome column in the table gives the assembly size and a link to the genome in WormBase, or the approximate size if it has not been assembled.

Gene

The Genes column in the table indicates whether gene structures have been added to WormBase.

Assembly

The lab that produced the assembly, the sequence coverage, the number of supercontigs, the supercontig N50, and anything else we know about the assembly.
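The supercontig N50 quoted in the Assembly column can be illustrated with a short sketch. It assumes the common definition: the length of the shortest supercontig in the smallest set of longest supercontigs that together cover at least half of the assembly. The function name `n50` and the example lengths are hypothetical and are not taken from any WormBase pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches at
    least half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical supercontig lengths, for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
example_n50 = n50(example_lengths)
print(example_n50)
```

A larger N50 for a similar genome size generally indicates a more contiguous assembly, which is why it is listed next to the supercontig count.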

The current genomes

{| class="wikitable" border="1"
|-
! No.
! Clade
! Species
! NCBI Taxon
! Genome
! Gene-set
! Assembly
! Comments
|-bgcolor="#FFFF99"
| 1
| V
| [[Caenorhabditis briggsae|''Caenorhabditis briggsae''<br>strain AF16]]
| 6238
| 108419768 bp
| Curated
| WashU
|
*First added in WS132
*[Sept 2010] New assembly from Erich Haag being worked on.
*[Feb 2011] updated in WS224
|-bgcolor="#FFFF99"
| 2
| V
| [[Caenorhabditis species 9|''Caenorhabditis species 9''<br>strain JU1422]]
| 870437
| 204396809 bp
| External
| WashU<br>Supercontigs: 7636<br>N50: 196652
|
*First added in WS226
*[Dec 2011] Update on contamination: There is no evidence that C. sp. 9 underwent cross-contamination, and the "C sp. 7" contaminants in the sp. 9 genome and transcriptome may actually be sp. 9 contaminants which got put into sp. 7.
*[Feb 2012] <font color=red>'''WARNING'''</font> The C. species 9 assembly in WormBase WS226-WS230 has some confirmed contamination by C. species 7. Since species 9 has mostly species 9 genes and some species 7 genes, there is actually added value to having it, as long as people know of the potential problem with "apparent paralogs" that are actually sp. 7 genes.
|-bgcolor="#FFFF99"
| 3
| V
| [[Caenorhabditis species 5|''Caenorhabditis species 5''<br>strain DRD-2008/JU800]]
| 497829
| 131797386 bp
| External
| WashU<br>Coverage: 150x<br>Supercontigs: 15261<br>N50: 25228
|
*First added in WS226
|-bgcolor="#FFFF99"
| 4
| V
| [[Caenorhabditis remanei|''Caenorhabditis remanei''<br>strain PB4641]]
| 31234
| 145500347 bp
| Curated
| WashU<br>Coverage: 9.2x<br>Supercontigs: 3670<br>N50: 461060
|
*First added in WS185
|-bgcolor="#FFFF99"
| 5
| V
| [[Caenorhabditis species 11|''Caenorhabditis species 11''<br>strain JU1373]]
| 886184
| 79321433 bp
| External
| WashU<br>Coverage: 19.1x<br>Supercontigs: 665<br>N50: 20921866
|
*First added in WS226
*Replaced genes by RNAseq-based gene set in WS227.
|-bgcolor="#FFFF99"
| 6
| V
| [[Caenorhabditis brenneri|''Caenorhabditis brenneri''<br>strain PB2801]]<br>(species 4)
| 135651
| 190421492 bp
| Curated
| WashU<br>Coverage: 9.5x<br>Supercontigs: 3305<br>N50: 368319
|
*First added in WS196
*[Jan 2011] The current assembly contains quite a bit of heterozygosity.
*<font color=red>'''Warning'''</font> WS223-WS226 genome assembly is not in sync with the annotation files or INSDC. Please use WS227+
|-bgcolor="#FFFF33"
| 7
| V
| [[Caenorhabditis elegans|'''''Caenorhabditis elegans''<br>strain Bristol N2''']]
| 6239
| 100272276 bp
| Curated
| WashU/Sanger<br>Coverage: 6x
|
*First added in WS1
|-bgcolor="#FFFF99"
| 8
| V
| [[Caenorhabditis species 7|''Caenorhabditis species 7''<br>strain JU1286]]
| 870436
|
| None
| WashU
|
*First added in WS226.
*<font color=red>'''WARNING'''</font> the genome sequence contains contaminations.
|-bgcolor="#FFFF99"
| 9
| V
| [[Caenorhabditis japonica|''Caenorhabditis japonica''<br>strain DF5080]]
| 281687
| 166565019 bp
| Curated
| WashU<br>Coverage: 22x<br>Supercontigs: 18817<br>N50: 94149
|
*First added in WS195
*[Jan 2011] New/improved assembly is being worked on at WashU.
*[Oct 2011] new assembly in WS227
|-bgcolor="#FFFF99"
| 10
| V
| [[Caenorhabditis angaria|''Caenorhabditis angaria''<br>strain PS1010]]<br>(species 3)
| 96668
| 79761545 bp
| External
| CalTech<br>Supercontigs: 33559<br>N50: 9453
|
*First added in WS218
*[Jan 2011] This species now has an official name of '''C. angaria'''
|-bgcolor="#FFFF99"
| 11
| V
| [[Haemonchus contortus|''Haemonchus contortus'']]
| 6289
| 297975349 bp
| Predicted
| Sanger<br>Supercontigs: 59707<br>N50: 13338
|
*First added in WS209.
|-bgcolor="#FFFF99"
| 12
| V
| [[Heterorhabditis bacteriophora|''Heterorhabditis bacteriophora''<br>strain M31e]]
| 37862
| 76974349 bp
| None
| WashU<br>Coverage: 26.1x<br>Supercontigs: 1240<br>N50: 312328
|
*First added in WS229
*[Jan 2011] Submitted to GenBank - Accession: EF043402
*[Sep 2011] Gene set and Annotations are being worked on.
|-bgcolor="#FFFF99"
| 13
| V
| [[Pristionchus pacificus|''Pristionchus pacificus''<br>strain PS312]]
| 54126
| 172773083 bp
| External
| WashU/MPI<br>Coverage: 8.92x<br>Supercontigs: 18083<br>N50: 1244534
|
*First added in WS194
*[16 December 2010] Updated to the newest assembly and geneset in WS221
|-bgcolor="#99FF99"
| 14
| IV
| [[Meloidogyne incognita|''Meloidogyne incognita''<br>strain Morelos]]
| 6306
| 82095019 bp
| None
| INRA<br>Supercontigs: 9538<br>N50: 83000
|
*First added in WS205
*Genes are not yet available. The official M.incognita genes are only available at [http://www.inra.fr/meloidogyne_incognita/genomic_resources INRA] and their structure hasn't been made public.
|-bgcolor="#99FF99"
| 15
| IV
| [[Meloidogyne hapla|''Meloidogyne hapla''<br>strain VW9]]
| 6305
| 53017507 bp
| External
| NCSU hapla.org<br>Supercontigs: 3452<br>N50: 84000
|
*First added in WS204
|-bgcolor="#99FF99"
| 16
| IV
| [[Strongyloides ratti|''Strongyloides ratti''<br>natural isolate]]
| 34506
| 52638471 bp
| Predicted
| Sanger<br>Coverage: 70x<br>Supercontigs: 2184<br>N50: 359029
|
*[Jan 2011] draft assembly in GenBank
*First added in WS226
|-bgcolor="#99FF99"
| 17
| IV
| [[Bursaphelenchus xylophilus|''Bursaphelenchus xylophilus''<br>strain Ka4C1]]
| 6326
| 74561461 bp
| Predicted
| Sanger<br>Coverage: 13x<br>Supercontigs: 5527<br>N50: 1158000
|
*[Sept 2011] published in PLOS Pathogens
*[Nov 2011] First added in WS229
|-bgcolor="#FF9900"
| 18
| III
| [[Ascaris suum|''Ascaris suum''<br>natural isolate]]
| 6253
| 272782664 bp
| External
| Davis<br>Coverage: 70x<br>Supercontigs: 29831<br>N50: 407899
|
*First added in WS229
*[Oct 2011] integrated the Davis genome without a reference gene set.
*[Nov 2011] added a reference gene set
|-bgcolor="#FF9900"
| 19
| III
| [[Brugia malayi|''Brugia malayi''<br>TRS]]
| 6279
| 95814443 bp
| External / Predicted
| TIGR -> WashU/Sanger<br>Supercontigs: 27210<br>N50: 37841
|
*First added in WS185
*[Sept 2010] Currently using the old TIGR assembly. Waiting for WashU (did assembly) and Sanger (did gene models) to publish, then we will use the new assembly.
*[Dec 2010] merged Augustus gene predictions from Erich Schwarz into WS216
|-bgcolor="#33FFFF"
| 20
| I
| [[Trichinella spiralis|''Trichinella spiralis''<br>strain ISS 195]]
| 6334
| 56779425 bp
| External
| WashU<br>Supercontigs: 6863<br>N50: 3383625
|
*[Sept 2010] Being assembled.
*[Feb 2011] published in Nature.
*[Mar 2011] First added in WS225
|}

Genomes coming soon

Clade Species NCBI Taxon Genome Gene-set Assembly Comments
IV Steinernema carpocapsae 34508 230 Mb None LANGEBIO / CalTech [May 2011] Being assembled.
V Caenorhabditis drosophilae 96641 None WashU [Sept 2010] being assembled
III Onchocerca volvulus 6282 None Sanger
IV Globodera pallida 36090 None Sanger
V Nippostrongylus brasiliensis 27835 None Sanger
IV Strongyloides ransomi 553534 None Sanger
V Teladorsagia circumcincta 45464 None Sanger
I Trichuris muris 70415 None Sanger
V Ancylostoma caninum 29170 None WashU
V Ancylostoma ceylanicum 53326 None WashU
V Ancylostoma duodenale 51022 None WashU
V Cooperia oncophora 27828 None WashU
V Dictyocaulus viviparus 29172 None WashU
V Necator americanus 51031 None WashU
V Nematodirus battus 28839 None WashU
V Oesophagostomum dentatum 61180 None WashU
V Ostertagia ostertagi 6317 None WashU
V Teladorsagia circumcincta 45464 None WashU

Phylogeny

Based on my understanding of the current phylogenetic literature and on personal communications with Karin Kiontke, David Fitch and Mark Blaxter, the correct guide tree would be:

((((((((((((C.briggsae,C.sp9),C.sp5),C.remanei),(C.sp11,C.brenneri)),C.elegans),(C.sp7,C.japonica)),C.angaria),(H.contortus,H.bacteriophora)),P.pacificus),((M.incognita,M.hapla),(S.ratti,B.xylophilus))),(A.suum,B.malayi)),T.spiralis);

Treeprint5.png
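The guide tree above is written in Newick notation, so it can be checked mechanically. The following is a minimal sketch of a parser for this restricted form only (unquoted leaf names, no branch lengths or internal node labels). The names `parse_newick` and `leaves` are hypothetical helpers written for this illustration and are not part of any WormBase tool.

```python
GUIDE_TREE = (
    "((((((((((((C.briggsae,C.sp9),C.sp5),C.remanei),(C.sp11,C.brenneri)),"
    "C.elegans),(C.sp7,C.japonica)),C.angaria),(H.contortus,H.bacteriophora)),"
    "P.pacificus),((M.incognita,M.hapla),(S.ratti,B.xylophilus))),"
    "(A.suum,B.malayi)),T.spiralis);"
)


def parse_newick(text):
    """Parse a simple Newick string into nested tuples of leaf names.

    Assumes unquoted names and no branch lengths or internal labels.
    """
    text = "".join(text.split()).rstrip(";")
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else ""

    def node():
        nonlocal pos
        if peek() == "(":
            pos += 1  # consume "("
            children = [node()]
            while peek() == ",":
                pos += 1  # consume ","
                children.append(node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
            return tuple(children)
        # Otherwise read a leaf name up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        if start == pos:
            raise ValueError(f"expected a name at position {pos}")
        return text[start:pos]

    tree = node()
    if pos != len(text):
        raise ValueError("trailing characters after the tree")
    return tree


def leaves(tree):
    """Return leaf names in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


tree = parse_newick(GUIDE_TREE)
names = leaves(tree)
print(len(names), names[0], names[-1])
```

Counting the leaves is a quick consistency check: the tree should contain one leaf for each of the 20 species in the table of current genomes, with T.spiralis as the outgroup at the root.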

See also