WS173

Release Notes

 New release of WormBase WS173, Wormpep173 and Wormrna173 Fri Mar 23 18:22:22 GMT 2007
 
 
 WS173 was built by Michael Han
 ======================================================================
 
 This directory includes:
 i)   database.WS173.*.tar.gz    -   compressed data for new release
 ii)  models.wrm.WS173           -   the latest database schema (also in above database files)
 iii) CHROMOSOMES/subdir         -   contains 3 files (DNA, GFF & AGP per chromosome)
 iv)  WS173-WS172.dbcomp         -   log file reporting difference from last release
 v)   wormpep173.tar.gz          -   full Wormpep distribution corresponding to WS173
 vi)   wormrna173.tar.gz          -   latest WormRNA release containing non-coding RNAs in the genome
 vii)  confirmed_genes.WS173.gz   -   DNA sequences of all genes confirmed by EST &/or cDNA
 viii) cDNA2orf.WS173.gz           -   Latest set of ORF connections to each cDNA (EST, OST, mRNA)
 ix)   gene_interpolated_map_positions.WS173.gz    - Interpolated map positions for each coding/RNA gene
 x)    clone_interpolated_map_positions.WS173.gz   - Interpolated map positions for each clone
 xi)   best_blastp_hits.WS173.gz  - for each C. elegans WormPep protein, lists Best blastp match to
                             human, fly, yeast, C. briggsae, and SwissProt & TrEMBL proteins.
 xii)  best_blastp_hits_brigprot.WS173.gz   - for each C. briggsae protein, lists Best blastp match to
                                      human, fly, yeast, C. elegans, and SwissProt & TrEMBL proteins.
 xiii) geneIDs.WS173.gz   - list of all current gene identifiers with CGC & molecular names (when known)
 xiv)  PCR_product2gene.WS173.gz   - Mappings between PCR products and overlapping Genes
 
 
 Release notes on the web:
 -------------------------
 http://www.wormbase.org/wiki/index.php/Release_notes
 
 
 
 Genome sequence composition:
 ----------------------------
 
         WS173           WS172           change
 ----------------------------------------------
 a       32365889        32365889          +0
 c       17779856        17779856          +0
 g       17756016        17756016          +0
 t       32365689        32365689          +0
 n       0               0                 +0
 
 Total   100267450       100267450         +0
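The totals in the table can be checked, and a GC fraction derived, with a few lines of Python. This is only an illustrative sketch: the counts are copied from the WS173 column above, and the function name `composition` is a hypothetical helper, not part of any WormBase tool.

```python
# Base counts copied from the WS173 column of the composition table.
counts = {"a": 32365889, "c": 17779856, "g": 17756016, "t": 32365689, "n": 0}


def composition(base_counts):
    """Return (total bases, GC fraction) for a dict of per-base counts."""
    total = sum(base_counts.values())
    gc = base_counts["g"] + base_counts["c"]
    return total, gc / total


total, gc_fraction = composition(counts)
print(f"Total bases: {total}")
print(f"GC content:  {gc_fraction:.2%}")
```

The computed total should agree with the "Total" row of the table.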
 
 
 Chromosomal Changes:
 --------------------
 There are no changes to the chromosome sequences in this release.
 
 
 Gene data set (Live C.elegans genes 24019)
 ------------------------------------------
 Molecular_info              22327 (93%)
 Concise_description          4474 (18.6%)
 Reference                    6939 (28.9%)
 CGC_approved Gene name       9084 (37.8%)
 RNAi_result                 19856 (82.7%)
 Microarray_results          19143 (79.7%)
 SAGE_transcript             20048 (83.5%)
 
 There are 20106 CDSs in autoace, or 23258 when the 3152 alternate splice forms are counted.
 
 
 Status of entries: Confidence level of prediction (based on the amount of transcript evidence)
 -------------------------------------------------
 Confirmed              7805 (33.6%)     Every base of every exon has transcription evidence (mRNA, EST etc.)
 Partially_confirmed   10840 (46.6%)     Some, but not all exon bases are covered by transcript evidence
 Predicted              4613 (19.8%)     No transcriptional evidence at all
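The percentages in this table are each count divided by the number of CDSs including alternate splice forms (23258). A minimal sketch of that arithmetic follows, assuming one-decimal rounding; the helper name `percentages` is hypothetical.

```python
def percentages(counts):
    """Map each category to its share of the summed counts, as a percentage (1 decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts copied from the confidence-level table above.
status = {"Confirmed": 7805, "Partially_confirmed": 10840, "Predicted": 4613}
shares = percentages(status)

for name, share in shares.items():
    print(f"{name:<20} {status[name]:>6} ({share}%)")
```

The three counts sum to 23258, the CDS figure that includes alternate splice forms.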
 
 
 
 Status of entries: Protein Accessions
 -------------------------------------
 UniProtKB/Swiss-Prot accessions   3504 (15.1%)
 UniProtKB/TrEMBL accessions     19421 (83.5%)
 
 
 
 Status of entries: Protein_IDs in EMBL
 --------------------------------------
 Protein_id            22898 (98.5%)
 
 
 
 Gene <-> CDS,Transcript,Pseudogene connections (cgc-approved)
 ---------------------------------------------
 Entries with CGC-approved Gene name   7443
 
 GeneModel correction progress WS172 -> WS173
 -----------------------------------------
 Confirmed introns not in a CDS gene model;
 
                 +---------+--------+
                 | Introns | Change |
                 +---------+--------+
 Cambridge       |    184  |    -1  |
 St Louis        |    215  |    +1  |
                 +---------+--------+
 
 
 Members of known repeat families that overlap predicted exons;
 
                 +---------+--------+
                 | Repeats | Change |
                 +---------+--------+
 Cambridge       |      6  |     0  |
 St Louis        |      6  |     0  |
                 +---------+--------+
 
 
 
 Synchronisation with GenBank / EMBL:
 ------------------------------------
 
 No synchronisation issues
 
 
 There are no gaps remaining in the genome sequence
 ---------------
 For more info mail help@wormbase.org
 -===================================================================================-
 
 
 
 New Data:
 ---------
 * added Nemagenetag dbinfo for cross-referencing with the Nemagene website
 * Deletion_validation data added to variations to show additional experimental evidence
 
 Genome sequence updates:
 -----------------------
 * change of ownership for some genomic clones between WashU and Sanger
 
 New Fixes:
 ----------
 
 
 Known Problems:
 ---------------
 
 
 Other Changes:
 --------------
 
 Proposed Changes / Forthcoming Data:
 -------------------------------------
 
 
 Model Changes:
 ------------------------------------
 * made Reference UNIQUE in RNAi class
 
 
 -===================================================================================-
 
 
 Quick installation guide for UNIX/Linux systems
 -----------------------------------------------
 
 1. Create a new directory to contain your copy of WormBase,
         e.g. /users/yourname/wormbase
 
 2. Unpack and untar all of the database.*.tar.gz files into
         this directory. You will need approximately 2-3 GB of disk space.
 
 3. Obtain and install a suitable acedb binary for your system
         (available from www.acedb.org).
 
 4. Use the acedb 'xace' program to open your database, e.g.
         type 'xace /users/yourname/wormbase' at the command prompt.
 
 5. See the acedb website for more information about acedb and
         using xace.
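Steps 1 and 2 above can also be scripted. The following is a rough sketch using only the Python standard library; the function name `unpack_release` and the directory paths in the usage comment are illustrative assumptions, and it presumes the database.*.tar.gz files have already been downloaded into a local directory.

```python
import glob
import os
import tarfile


def unpack_release(download_dir, target_dir):
    """Extract every database.*.tar.gz archive in download_dir into target_dir.

    Returns the list of archive paths that were unpacked.
    """
    # Step 1: create the directory that will hold the database.
    os.makedirs(target_dir, exist_ok=True)

    # Step 2: decompress and untar each archive into that directory.
    archives = sorted(glob.glob(os.path.join(download_dir, "database.*.tar.gz")))
    for archive in archives:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target_dir)
    return archives


# Example call (hypothetical paths):
# unpack_release("/users/yourname/downloads", "/users/yourname/wormbase")
```

Once the archives are unpacked, steps 3 to 5 proceed as described: install an acedb binary and open the target directory with xace.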