WS158


Release Letter

New release of WormBase WS158, Wormpep158 and Wormrna158 Thu May  4 13:10:59 BST 2006


WS158 was built by Paul Davis [paul.davis@wormbase.org]
======================================================================

This directory includes:
i)   database.WS158.*.tar.gz    -   compressed data for new release
ii)  models.wrm.WS158           -   the latest database schema (also in above database files)
iii) CHROMOSOMES/subdir         -   contains 3 files (DNA, GFF & AGP per chromosome)
iv)  WS158-WS157.dbcomp         -   log file reporting difference from last release
v)   wormpep158.tar.gz          -   full Wormpep distribution corresponding to WS158
vi)   wormrna158.tar.gz          -   latest WormRNA release containing non-coding RNAs in the genome
vii)  confirmed_genes.WS158.gz   -   DNA sequences of all genes confirmed by EST &/or cDNA
viii) cDNA2orf.WS158.gz           -   Latest set of ORF connections to each cDNA (EST, OST, mRNA)
ix)   gene_interpolated_map_positions.WS158.gz    - Interpolated map positions for each coding/RNA gene
x)    clone_interpolated_map_positions.WS158.gz   - Interpolated map positions for each clone
xi)   best_blastp_hits.WS158.gz  - for each C. elegans WormPep protein, lists the best blastp match to
                            human, fly, yeast, C. briggsae, and SwissProt & TrEMBL proteins.
xii)  best_blastp_hits_brigprot.WS158.gz   - for each C. briggsae protein, lists the best blastp match to
                                     human, fly, yeast, C. elegans, and SwissProt & TrEMBL proteins.
xiii) geneIDs.WS158.gz   - list of all current gene identifiers with CGC & molecular names (when known)
xiv)  PCR_product2gene.WS158.gz   - Mappings between PCR products and overlapping Genes


Release notes on the web:
-------------------------
http://wwwdev.sanger.ac.uk/Projects/C_elegans/WORMBASE



Primary databases used in build WS158
------------------------------------
brigdb : 20-- - updated
camace : 1970-01-01 - updated
citace : 2006-04-14 - updated
cshace : 2006-03-27 - updated
genace : 1970-01-01 - updated
stlace : 2006-04-14 - updated


Genome sequence composition:
----------------------------

        WS158           WS157           change
----------------------------------------------
a       32365775        32365775          +0
c       17779813        17779813          +0
g       17755968        17755968          +0
t       32365578        32365578          +0
n       0               0                 +0

Total   100267134       100267134         +0
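
The per-base counts above could be reproduced from the chromosome DNA files with a short script. The following is a minimal sketch, assuming the DNA files are FASTA-style text (header lines beginning with '>'); the function name count_bases is illustrative and is not part of the WormBase build scripts.

```python
from collections import Counter


def count_bases(lines):
    """Count a/c/g/t/n in FASTA-style lines, skipping '>' header lines.

    Assumption: sequence files are plain text with optional '>' headers.
    """
    counts = Counter({base: 0 for base in "acgtn"})
    for line in lines:
        if line.startswith(">"):
            continue  # header line, not sequence
        for base in line.strip().lower():
            if base in counts:
                counts[base] += 1
    return counts


# Counts reported for WS158 in the table above.
ws158 = {"a": 32365775, "c": 17779813, "g": 17755968, "t": 32365578, "n": 0}
total = sum(ws158.values())
gc_fraction = (ws158["g"] + ws158["c"]) / total
print(total, round(gc_fraction, 4))
```

The same function can be pointed at an open file handle, since it only iterates over lines.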


Chromosomal Changes:
--------------------
There are no changes to the chromosome sequences in this release.


Gene data set (Live C. elegans genes 23752)
-------------------------------------------
Molecular_info              22006 (92.6%)
Concise_description          4179 (17.6%)
Reference                    5058 (21.3%)
CGC_approved Gene name       8771 (36.9%)
RNAi_result                 19795 (83.3%)
Microarray_results          19127 (80.5%)
SAGE_transcript             18185 (76.6%)




Wormpep data set:
----------------------------

There are 20096 CDS in autoace, 23162 when counting 3066 alternate splice forms.

The 23162 sequences contain 10,165,980 amino acids in total.

Modified entries              29
Deleted entries                9
New entries                    8
Reappeared entries             1

Net change  +0
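
The figures in this section are consistent with simple bookkeeping. A small sketch, assuming that the net change is new plus reappeared minus deleted entries (modified entries leave the count unchanged); the function name net_change is hypothetical.

```python
def net_change(new, reappeared, deleted):
    """Net change in entry count; modified entries do not affect it.

    Assumption: this is how the 'Net change' line is derived.
    """
    return new + reappeared - deleted


# Values taken from the Wormpep section above.
cds_count = 20096
alternate_splice_forms = 3066
total_sequences = cds_count + alternate_splice_forms

print(total_sequences, net_change(new=8, reappeared=1, deleted=9))
```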



Status of entries: Confidence level of prediction (based on the amount of transcript evidence)
-------------------------------------------------
Confirmed              6686 (28.9%)     Every base of every exon has transcription evidence (mRNA, EST etc.)
Partially_confirmed   11442 (49.4%)     Some, but not all exon bases are covered by transcript evidence
Predicted              5034 (21.7%)     No transcriptional evidence at all
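
The three confidence levels can be read as a rule over per-base transcript coverage. A minimal sketch of that rule, assuming coverage is supplied as one boolean per exon base; the function name and input format are illustrative only and do not describe the actual build code.

```python
def confirmation_status(exon_base_covered):
    """Classify a prediction from per-base transcript evidence.

    exon_base_covered: sequence of booleans, one per exon base
    (hypothetical input format chosen for illustration).
    """
    covered = list(exon_base_covered)
    if not covered:
        raise ValueError("a gene model needs at least one exon base")
    if all(covered):
        return "Confirmed"            # every exon base has evidence
    if any(covered):
        return "Partially_confirmed"  # some, but not all, bases covered
    return "Predicted"                # no transcript evidence at all


print(confirmation_status([True, True, False]))
```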



Status of entries: Protein Accessions
-------------------------------------
UniProtKB/Swiss-Prot accessions      0 (0.0%)
UniProtKB/TrEMBL accessions         0 (0.0%)



Status of entries: Protein_ID's in EMBL
---------------------------------------
Protein_id                0 (0.0%)



Gene <-> CDS, Transcript, Pseudogene connections (CGC-approved)
---------------------------------------------
Entries with CGC-approved Gene name   7078


GeneModel correction progress WS157 -> WS158
-----------------------------------------
Confirmed introns not in a CDS gene model:

                +---------+--------+
                | Introns | Change |
                +---------+--------+
Cambridge       |   3293  |   +43  |
St Louis        |     15  |    -2  |
                +---------+--------+


Members of known repeat families that overlap predicted exons:

                +---------+--------+
                | Repeats | Change |
                +---------+--------+
Cambridge       |    578  |    -1  |
St Louis        |    750  |    +1  |
                +---------+--------+



Synchronisation with GenBank / EMBL:
------------------------------------

No synchronisation issues


There are no gaps remaining in the genome sequence
---------------
For more info mail help@wormbase.org
-===================================================================================-



New Data:
---------


New Fixes:
----------


Known Problems:
--------------


Other Changes:
--------------

Proposed Changes / Forthcoming Data:
------------------------------------


Model Changes:
------------------------------------
* Added tags to RNAi and Gene_regulation for when an RNAi experiment describes gene regulation

-===================================================================================-


Quick installation guide for UNIX/Linux systems
-----------------------------------------------

1. Create a new directory to contain your copy of WormBase,
        e.g. /users/yourname/wormbase

2. Unpack and untar all of the database.*.tar.gz files into
        this directory. You will need approximately 2-3 GB of disk space.

3. Obtain and install a suitable acedb binary for your system
        (available from www.acedb.org).

4. Use the acedb 'xace' program to open your database, e.g.
        type 'xace /users/yourname/wormbase' at the command prompt.

5. See the acedb website for more information about acedb and
        using xace.
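
Steps 1, 2 and 4 above can also be scripted. The following is a rough sketch using only the Python standard library; the function names unpack_release and xace_command are hypothetical, and the script assumes the database.*.tar.gz files have already been downloaded to a local directory. It only builds the xace command line and does not run it, since that requires an installed acedb binary.

```python
import glob
import os
import tarfile


def unpack_release(download_dir, install_dir):
    """Create install_dir and extract every database.*.tar.gz into it.

    Returns the list of archives that were unpacked.
    """
    os.makedirs(install_dir, exist_ok=True)
    pattern = os.path.join(download_dir, "database.*.tar.gz")
    archives = sorted(glob.glob(pattern))
    for archive in archives:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(install_dir)
    return archives


def xace_command(install_dir):
    """Build the command from step 4 (assumes 'xace' is on the PATH)."""
    return ["xace", install_dir]
```

For example, unpack_release("/users/yourname/downloads", "/users/yourname/wormbase") followed by passing the result of xace_command("/users/yourname/wormbase") to subprocess.run would mirror the manual steps; the downloads path here is a placeholder.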

____________  END _____________