Difference between revisions of "WormBase Genomes"
(added some coverage, supercontig and N50 data) |
(added some more assembly data) |
Revision as of 14:26, 15 December 2011
WormBase Genomes
This is a record of the current and proposed set of genomes in WormBase.
We may alter our plans for which species to include as circumstances dictate, so the list of organisms should be treated as somewhat tentative.
Clade
The Major Clades of Blaxter et al 1998 ("Bclades"), as systematised by De Ley and Blaxter 2002-2004.
Tiers
The genomes in WormBase are classified into tiers according to the amount of curation effort we are able to put into maintaining them.
Tier I - All efforts are made to curate the gene structures and any other genetic or metabolic information. Only C. elegans is in this group.
Tier II - Efforts are made, where practical, to manually curate the gene structure and possibly some other genomic information. WormBase 'owns' the assembly in the ENA and GenBank so that new gene annotations can be submitted to the ENA/GenBank by WormBase.
Tier III - No curation by WormBase. We will set up the genome on WormBase with any gene structures that the authors of this genome have predicted.
Tier IV - No curation by WormBase. Only transcriptome information provided by the authors and no coherent genome.
Tier V - No curation by WormBase. Only genome information provided by the authors and no coherent transcriptome.
Genome
The Genome column in the table gives the assembly size and a link to the genome in WormBase, or the approximate size if it has not been assembled.
Genes
The Genes column in the table indicates whether gene structures have been added to WormBase.
Assembly
The Assembly column in the table gives the lab that produced the assembly, the sequence coverage, the number of supercontigs, the supercontig N50, and anything else that we know about the assembly.
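The page does not say how the N50 values in the table were computed. As a minimal sketch, assuming the usual definition (the length L such that supercontigs of length L or longer together hold at least half of the assembled bases), the calculation could look like this. The function name `n50` and the example lengths are hypothetical and are not taken from any WormBase assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumes the common definition: sort lengths from longest to shortest
    and return the length at which the running total first reaches half
    of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical supercontig lengths in bp, for illustration only.
example_lengths = [50, 40, 30, 20, 10]
print(n50(example_lengths))
```

With real data the lengths would come from the supercontig sequences themselves, for example by reading the assembly FASTA file and counting the bases in each record.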
The current genomes
| Clade | Species | NCBI Taxon | Tier | Genome | Genes | Assembly | Comments |
|---|---|---|---|---|---|---|---|
| V | Caenorhabditis briggsae strain AF16 | 6238 | II | 108419768 bp | Yes | WashU | First added in WS132<br>[Sept 2010] New assembly from Erich Haag being worked on.<br>[Feb 2011] updated in WS224 |
| V | Caenorhabditis species 9 strain JU1422 | 870437 | V | 204396809 bp | | WashU<br>Supercontigs: 7636<br>N50: 196652 | First added in WS226<br>[Dec 2011] Update on contamination: There is no evidence that C. sp. 9 underwent cross-contamination, and the "C. sp. 7" contaminants in the sp. 9 genome and transcriptome may actually be sp. 9 contaminants which got put into sp. 7. |
| V | Caenorhabditis species 5 | ? | V | | | WashU | First added in WS226 |
| V | Caenorhabditis remanei | 31234 | II | 145500347 bp | Yes | WashU<br>Coverage: 9.2x<br>Supercontigs: 3670<br>N50: 461060 | First added in WS185 |
| V | Caenorhabditis species 11 strain JU1373 | 886184 | III | 79321433 bp | Yes | WashU<br>Coverage: 19.1x<br>Supercontigs: 665<br>N50: 20921866 | First added in WS226<br>Replaced genes by RNAseq-based gene set in WS227. |
| V | Caenorhabditis brenneri (species 4) | 135651 | II | 190421492 bp | Yes | WashU<br>Coverage: 9.5x<br>Supercontigs: 3305<br>N50: 368319 | First added in WS196<br>[Jan 2011] The current assembly contains quite a bit of heterozygosity.<br>Warning: the WS223-WS226 genome assembly is not in sync with the annotation files or INSDC. Please use WS227+ |
| V | Caenorhabditis elegans strain Bristol N2 | 6239 | I | 100272276 bp | Yes | WashU/Sanger<br>Coverage: 6x | First added in WS1 |
| V | Caenorhabditis species 7 strain JU1286 | 870436 | V | | | WashU | First added in WS226.<br>WARNING: the genome sequence contains contamination |
| V | Caenorhabditis japonica strain DF5080 | 281687 | II | 166565019 bp | Yes | WashU<br>Coverage: 22x<br>Supercontigs: 18817<br>N50: 94149 | First added in WS195<br>[Jan 2011] New/improved assembly is being worked on at WashU.<br>[Oct 2011] new assembly in WS227 |
| V | Caenorhabditis angaria strain PS1010 (species 3) | 96668 | III | 79761545 bp | Yes | CalTech<br>Supercontigs: 33559<br>N50: 9453 | First added in WS218<br>[Jan 2011] This species now has an official name of C. angaria |
| V | Haemonchus contortus | 6289 | III | 297975349 bp | Yes | Sanger<br>Supercontigs: 59707<br>N50: 13338 | First added in WS209. |
| V | Heterorhabditis bacteriophora strain M31e | 37862 | V | 76974349 bp | | WashU<br>Coverage: 26.1x<br>Supercontigs: 1240<br>N50: 312328 | First added in WS229<br>[Jan 2011] Submitted to GenBank - Accession: EF043402<br>[Sep 2011] Gene set and Annotations are being worked on. |
| V | Pristionchus pacificus strain PS312 | 54126 | II | 172773083 bp | Yes | WashU/MPI<br>Coverage: 8.92x<br>Supercontigs: 18083<br>N50: 1244534 | First added in WS194<br>[16 December 2010] Updated to the newest assembly and geneset in WS221 |
| V | Steinernema carpocapsae | 34508 | III | 230 Mb | | LANGEBIO / CalTech | [May 2011] Being assembled. |
| IV | Meloidogyne incognita | 6306 | III | 82095019 bp | | INRA<br>Supercontigs: 9538<br>N50: 83000 | First added in WS205<br>Genes are not yet available. The official M.incognita genes are only available at INRA and their structure hasn't been made public. |
| IV | Meloidogyne hapla | 6305 | III | 53017507 bp | Yes | NCSU hapla.org<br>Supercontigs: 3452<br>N50: 84000 | First added in WS204 |
| IV | Strongyloides ratti natural isolate | 34506 | III | 52638471 bp | | Sanger<br>Coverage: 70x<br>Supercontigs: 2184<br>N50: 359029 | [Jan 2011] draft assembly in GenBank<br>First added in WS226 |
| IV | Bursaphelenchus xylophilus strain Ka4C1 | 6326 | III | 74561461 bp | Yes | Sanger<br>Coverage: 13x<br>Supercontigs: 5527<br>N50: 1158000 | [Sept 2011] published in PLOS Pathogens<br>[Nov 2011] First added in WS229 |
| III | Ascaris suum natural isolate | 6253 | III | 272782664 bp | Yes | Davis<br>Coverage: 70x<br>Supercontigs: 29831<br>N50: 407899 | First added in WS229<br>[Oct 2011] integrated the Davis genome without a reference gene set.<br>[Nov 2011] added a reference gene set |
| III | Brugia malayi natural isolate | 6279 | III | 95814443 bp | Yes | TIGR -> WashU/Sanger<br>Supercontigs: 27210<br>N50: 37841 | First added in WS185<br>[Sept 2010] Currently using the old TIGR assembly. Waiting for WashU (did assembly) and Sanger (did gene models) to publish, then we will use the new assembly.<br>[Dec 2010] merged Augustus gene predictions from Erich Schwarz into WS216 |
| I | Trichinella spiralis | 6334 | III | 56779425 bp | Yes | WashU<br>Supercontigs: 6863<br>N50: 3383625 | [Sept 2010] Being assembled.<br>[Feb 2011] published in Nature.<br>[Mar 2011] First added in WS225 |
Genomes coming soon
| Clade | Species | NCBI Taxon | Tier | Genome | Genes | Assembly | Comments |
|---|---|---|---|---|---|---|---|
| V | Caenorhabditis drosophilae | 96641 | III | | | WashU | [Sept 2010] being assembled |
| V | Caenorhabditis elegans strain DR1035 | 6239 | III | 100 Mb | | Mark Blaxter | status unclear |
| | Onchocerca volvulus | 6282 | III | | | Sanger | |
| | Globodera pallida | 36090 | III | | | Sanger | |
| | Nippostrongylus brasiliensis | 27835 | III | | | Sanger | |
| | Strongyloides ransomi | 553534 | III | | | Sanger | |
| | Teladorsagia circumcincta | 45464 | III | | | Sanger | |
| | Trichuris muris | 70415 | III | | | Sanger | |
Phylogeny
Given my understanding of the current phylogenetic literature (and based on personal communications with Karin Kiontke, David Fitch and Mark Blaxter), the correct guide tree would be:
((((((((((((C.briggsae,C.sp9),C.sp5),C.remanei),(C.sp11,C.brenneri)),C.elegans),(C.sp7,C.japonica)),C.angaria),(H.contortus,H.bacteriophora)),P.pacificus),((M.incognita,M.hapla),(S.ratti,B.xylophilus))),(A.suum,B.malayi)),T.spiralis);
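The guide tree is written in a parenthesised, Newick-style notation with leaf names only (no branch lengths or internal labels). As a rough sketch, assuming exactly that restricted form, the following reads the tree into nested tuples and lists its leaves, which is a quick way to confirm that the parentheses balance and to see which species are covered. The helper names `parse_newick` and `leaves` are illustrative and do not come from any WormBase tool.

```python
def parse_newick(text):
    """Parse a topology-only Newick string into nested tuples of leaf names.

    Assumption: no branch lengths, no internal node labels, no quoting.
    """
    text = "".join(text.split()).rstrip(";")  # drop whitespace and final ';'
    pos = 0

    def node():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
            return tuple(children)
        # Otherwise read a leaf name up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        if start == pos:
            raise ValueError(f"expected a leaf name at position {pos}")
        return text[start:pos]

    tree = node()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return tree


def leaves(tree):
    """Return the leaf names of a parsed tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


GUIDE_TREE = (
    "((((((((((((C.briggsae,C.sp9),C.sp5),C.remanei),(C.sp11,C.brenneri)),"
    "C.elegans),(C.sp7,C.japonica)),C.angaria),(H.contortus,H.bacteriophora)),"
    "P.pacificus),((M.incognita,M.hapla),(S.ratti,B.xylophilus))),"
    "(A.suum,B.malayi)),T.spiralis);"
)

species = leaves(parse_newick(GUIDE_TREE))
print(len(species), species)
```

For anything beyond a sanity check, a maintained phylogenetics library would be a safer choice than this hand-written parser.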