Difference between revisions of "C.remanei"

From WormBaseWiki
(minor edits and formatting)
Line 1: Line 1:
 
= Overview =
 
== current data ==
The C.remanei data available in WormBase is based on the [http://genome.wustl.edu/pub/organism/Invertebrates/Caenorhabditis_remanei/assembly/Caenorhabditis_remanei-1.0/  Caenorhabditis_remanei-1.0] 6x assembly from WashU and genepredictions using the WashU merged gene prediction pipeline. Submitted to EMBL/GenBak was only the genomic assembly
+
The C.remanei data available in WormBase is based on the [http://genome.wustl.edu/pub/organism/Invertebrates/Caenorhabditis_remanei/assembly/Caenorhabditis_remanei-1.0/  Caenorhabditis_remanei-1.0] 6x assembly from WashU and gene predictions using the WashU merged gene prediction pipeline. These gene predictions have not been submitted to EMBL/GenBank, but the assembly has.
  
== future data ==
+
== Future Data ==
 
* a newer 9.2x C.remanei assembly is available from WashU [http://genome.wustl.edu/pub/organism/Invertebrates/Caenorhabditis_remanei/assembly/Caenorhabditis_remanei-15.0.1/ Caenorhabditis_remanei-15.0.1]
 
* the [[NGASP]] project plans to produce a new C.remanei gene set
Line 10: Line 10:
  
 
= State of Integration =
as of WS188
+
 
 +
As of WS188:
  
 
== ACeDB ==
 
Genes, proteins, and orthology predictions were imported in WS186 and are now automatically updated with each build. Blast homologies to C.remanei proteins are included for C.elegans and C.briggsae proteins.
 
=== known bugs ===
* The protein sequences were not imported (fixed in WS188)
+
* The protein sequences were not imported (fixed in WS188). Protein pages and proper cross links between them and their corresponding gene pages will be available in WS189.
 
* some blast hits from C.elegans/C.briggsae don't link to the correct protein (fix planned for WS189)
  
Line 22: Line 23:
 
=== known bugs ===
 
* the live site can't show remanei gene pages (fixed on dev)
* the small GBrowse window on the gene page doesn't show anything
+
* the small GBrowse window on the gene page doesn't show anything.
 
** that is due to renaming of the C.remanei contigs to resolve conflicts with C.brenneri
 
** an updated GFF file is planned for WS189 which will solve the GBrowse issues
  
== flatfiles ==
+
== Flat Files ==
 
=== GFF ===
 
A GFF file including CDSes and blast hits was created for WS186 and is available at [ftp://ftp.sanger.ac.uk/pub2/wormbase/WS186/CHROMOSOMES/remagff186/ remagff186].
 
The GFF file is not yet complete and doesn't include BLAT data and Gene Spans. An updated version is planned for WS189.
=== proteins ===
+
=== Proteins ===
 
A test version of C.remanei proteins, called remapep, was generated for WS186 and can be found at [ftp://ftp.sanger.ac.uk/pub2/wormbase/WS186/remapep186.gz remapep186].
 
The generation of history files and addition of all available IDs is planned for WS189.
Line 36: Line 37:
 
The contigs used for annotation can be found at [ftp://ftp.sanger.ac.uk/pub2/wormbase/WS186/CHROMOSOMES/remagff186/ remagff186]. The sequences should be identical to the WashU-1.0 assembly; only the contig names were prefixed with "Crem_".
  
== curation ==
+
== Curation ==
 
* gene models will be manually curated by WashU.
 
* gene names are already curated in the traditional ex-CGC / WormGeneNames way
 
* for the current gene models only systematic errors will be fixed, as we expect NGASP to release a new geneset any day
  
== history ==
+
== History ==
 
=== 2005 ===
 
preliminary data was made available as flatfiles and through the WormBase GBrowse

Revision as of 09:16, 28 February 2008

Overview

current data

The C.remanei data available in WormBase is based on the Caenorhabditis_remanei-1.0 6x assembly from WashU and gene predictions using the WashU merged gene prediction pipeline. These gene predictions have not been submitted to EMBL/GenBank, but the assembly has.

Future Data

  • a newer 9.2x C.remanei assembly is available from WashU Caenorhabditis_remanei-15.0.1
  • the NGASP project plans to produce a new C.remanei gene set
  • manual curation is supposed to start as soon as NGASP predictions for the new assembly are available.
  • EMBL/GenBank submission is supposed to start together with manual curation.

State of Integration

As of WS188:

ACeDB

Genes, proteins, and orthology predictions were imported in WS186 and are now automatically updated with each build. Blast homologies to C.remanei proteins are included for C.elegans and C.briggsae proteins.

known bugs

  • The protein sequences were not imported (fixed in WS188). Protein pages and proper cross links between them and their corresponding gene pages will be available in WS189.
  • some blast hits from C.elegans/C.briggsae don't link to the correct protein (fix planned for WS189)

Website

The default code is used to show the ACeDB data and there is a GBrowse version available to show the GFF annotation. It is also included in the three-way genomic alignment viewer on the dev site.

known bugs

  • the live site can't show remanei gene pages (fixed on dev)
  • the small GBrowse window on the gene page doesn't show anything.
    • that is due to renaming of the C.remanei contigs to resolve conflicts with C.brenneri
    • an updated GFF file is planned for WS189 which will solve the GBrowse issues

Flat Files

GFF

A GFF file including CDSes and blast hits was created for WS186 and is available at remagff186. The GFF file is not yet complete and doesn't include BLAT data and Gene Spans. An updated version is planned for WS189.
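
One way to check which feature types a given GFF file actually contains (for example, that CDS features are present while gene spans are still missing) is to tally the third column. The following is a minimal sketch, assuming the usual tab-separated GFF layout in which the third column holds the feature type; the function name and the sample feature names are illustrative and not taken from the WormBase files.

```python
from collections import Counter

def count_features(gff_lines):
    """Count GFF records per feature type (third tab-separated column)."""
    counts = Counter()
    for line in gff_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # A GFF record has at least 8 columns; ignore malformed lines.
        if len(cols) < 8:
            continue
        counts[cols[2]] += 1
    return counts
```

With a file on disk, something like `count_features(open(path))` would return a mapping from feature type to record count; a feature type that never appears simply has a count of zero.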

Proteins

A test version of C.remanei proteins, called remapep, was generated for WS186 and can be found at remapep186. The generation of history files and addition of all available IDs is planned for WS189.

DNA

The contigs used for annotation can be found at remagff186. The sequences should be identical to the WashU-1.0 assembly; only the contig names were prefixed with "Crem_".
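
To compare the WormBase contigs with the original WashU files, the names have to be mapped in one direction or the other. The sketch below shows one possible way to do that, assuming FASTA headers in which the contig name directly follows the ">" character; the helper names and the example contig name are hypothetical.

```python
PREFIX = "Crem_"  # prefix mentioned in the text above

def add_prefix(fasta_lines, prefix=PREFIX):
    """Yield FASTA lines with the prefix added to each header's contig name."""
    for line in fasta_lines:
        # Only header lines change; already-prefixed headers are left alone.
        if line.startswith(">") and not line[1:].startswith(prefix):
            yield ">" + prefix + line[1:]
        else:
            yield line

def strip_prefix(name, prefix=PREFIX):
    """Return the contig name without the prefix, if it has one."""
    return name[len(prefix):] if name.startswith(prefix) else name
```

Sequence lines pass through unchanged, so the output should differ from the input only in the header lines.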

Curation

  • gene models will be manually curated by WashU.
  • gene names are already curated in the traditional ex-CGC / WormGeneNames way
  • for the current gene models only systematic errors will be fixed, as we expect NGASP to release a new geneset any day

History

2005

preliminary data was made available as flatfiles and through the WormBase GBrowse

WS186

integration of C.remanei data into the canonical ACeDB database

(future) WS189

  • integration of C.remanei into the regular build pipeline
  • website fixes

(future) WS190

  • should be ready for inclusion in the frozen release