WS181

 New release of WormBase WS181, Wormpep181 and Wormrna181 Fri Sep  7 11:40:20 BST 2007
 
 
 WS181 was built by Gary Williams
 ======================================================================
 
 This directory includes:
 i)   database.WS181.*.tar.gz    -   compressed data for new release
 ii)  models.wrm.WS181           -   the latest database schema (also in above database files)
 iii) CHROMOSOMES/subdir         -   contains 3 files (DNA, GFF & AGP per chromosome)
 iv)  WS181-WS180.dbcomp         -   log file reporting difference from last release
 v)   wormpep181.tar.gz          -   full Wormpep distribution corresponding to WS181
 vi)   wormrna181.tar.gz          -   latest WormRNA release containing non-coding RNAs in the genome
 vii)  confirmed_genes.WS181.gz   -   DNA sequences of all genes confirmed by EST &/or cDNA
 viii) cDNA2orf.WS181.gz           -   Latest set of ORF connections to each cDNA (EST, OST, mRNA)
 ix)   gene_interpolated_map_positions.WS181.gz    - Interpolated map positions for each coding/RNA gene
 x)    clone_interpolated_map_positions.WS181.gz   - Interpolated map positions for each clone
 xi)   best_blastp_hits.WS181.gz  - for each C. elegans WormPep protein, lists the best blastp match to
                             human, fly, yeast, C. briggsae, and SwissProt & TrEMBL proteins.
 xii)  best_blastp_hits_brigprot.WS181.gz   - for each C. briggsae protein, lists the best blastp match to
                                      human, fly, yeast, C. elegans, and SwissProt & TrEMBL proteins.
 xiii) geneIDs.WS181.gz   - list of all current gene identifiers with CGC & molecular names (when known)
 xiv)  PCR_product2gene.WS181.gz   - Mappings between PCR products and overlapping Genes
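
 The file names listed above follow a fixed pattern built from the release
 number. The following is a minimal Python sketch that generates the expected
 names for a release. The function name expected_release_files is hypothetical,
 the CHROMOSOMES subdirectory is left out, and the database entry stays a glob
 pattern because these notes do not enumerate the individual archives.

```python
from fnmatch import fnmatch


def expected_release_files(release: int, previous: int) -> list[str]:
    """Return the file-name patterns from the directory listing.

    Hypothetical helper: it only substitutes the release numbers into the
    names shown in these notes; it does not contact any server.
    """
    ws = f"WS{release}"
    return [
        f"database.{ws}.*.tar.gz",  # glob pattern covering several archives
        f"models.wrm.{ws}",
        f"{ws}-WS{previous}.dbcomp",
        f"wormpep{release}.tar.gz",
        f"wormrna{release}.tar.gz",
        f"confirmed_genes.{ws}.gz",
        f"cDNA2orf.{ws}.gz",
        f"gene_interpolated_map_positions.{ws}.gz",
        f"clone_interpolated_map_positions.{ws}.gz",
        f"best_blastp_hits.{ws}.gz",
        f"best_blastp_hits_brigprot.{ws}.gz",
        f"geneIDs.{ws}.gz",
        f"PCR_product2gene.{ws}.gz",
    ]


def is_release_file(name: str, release: int, previous: int) -> bool:
    """True if a file name matches one of the expected patterns."""
    return any(fnmatch(name, p) for p in expected_release_files(release, previous))


if __name__ == "__main__":
    for pattern in expected_release_files(181, 180):
        print(pattern)
```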
 
 
 Release notes on the web:
 -------------------------
 http://www.wormbase.org/wiki/index.php/Release_notes
 
 
 
 Genome sequence composition:
 ----------------------------
 
        	WS181       	WS180      	change
 ----------------------------------------------
 a    	32365949	32365949	  +0
 c    	17779887	17779887	  +0
 g    	17756036	17756036	  +0
 t    	32365750	32365750	  +0
 n    	0       	0       	  +0
 -    	0       	0       	  +0
 
 Total	100267622	100267622	  +0
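
 The total in the table can be re-derived from the per-base counts. The
 Python sketch below does that and also computes the GC fraction; the GC
 figure is not reported in these notes and is derived here only as an
 illustration. The function names are hypothetical.

```python
# Base counts copied from the WS181 column of the composition table.
COUNTS = {
    "a": 32365949,
    "c": 17779887,
    "g": 17756036,
    "t": 32365750,
    "n": 0,
    "-": 0,
}


def total_bases(counts: dict[str, int]) -> int:
    """Sum of all symbols, including n and gap characters."""
    return sum(counts.values())


def gc_fraction(counts: dict[str, int]) -> float:
    """Fraction of called bases (a, c, g, t) that are g or c."""
    called = sum(counts[base] for base in "acgt")
    return (counts["g"] + counts["c"]) / called


print(total_bases(COUNTS))           # 100267622
print(f"{gc_fraction(COUNTS):.2%}")  # 35.44%
```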
 
 
 Chromosomal Changes:
 --------------------
 There are no changes to the chromosome sequences in this release.
 
 
 Gene data set (Live C. elegans genes 29501)
 -------------------------------------------
 Molecular_info              27790 (94.2%)
 Concise_description          4889 (16.6%)
 Reference                    7211 (24.4%)
 CGC_approved Gene name      14732 (49.9%)
 RNAi_result                 20690 (70.1%)
 Microarray_results          19945 (67.6%)
 SAGE_transcript             18645 (63.2%)
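
 Each percentage above is the count divided by the number of live genes
 (29501), rounded to one decimal place. A short Python sketch that
 reproduces the column; the helper name percent is hypothetical.

```python
LIVE_GENES = 29501

# Counts copied from the gene data set table.
GENE_COUNTS = {
    "Molecular_info": 27790,
    "Concise_description": 4889,
    "Reference": 7211,
    "CGC_approved Gene name": 14732,
    "RNAi_result": 20690,
    "Microarray_results": 19945,
    "SAGE_transcript": 18645,
}


def percent(count: int, total: int) -> float:
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


for tag, count in GENE_COUNTS.items():
    print(f"{tag:<26}{count:>6} ({percent(count, LIVE_GENES)}%)")
```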
 
 
 
 
 Wormpep data set:
 ----------------------------
 
 There are 20144 CDS in autoace, 23518 when counting 3374 alternate splice forms.
 
 The 23518 sequences contain  base pairs in total.
 
 Modified entries      23
 Deleted entries       0
 New entries           11
 Reappeared entries    0
 
 Net change  +11
 
 
 
 
 Status of entries: Confidence level of prediction (based on the amount of transcript evidence)
 -------------------------------------------------
 Confirmed              8109 (34.5%)	Every base of every exon has transcription evidence (mRNA, EST etc.)
 Partially_confirmed   10746 (45.7%)	Some, but not all exon bases are covered by transcript evidence
 Predicted              4663 (19.8%)	No transcriptional evidence at all
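
 The three confidence levels are defined by how many exon bases have
 transcript evidence. The Python sketch below restates those definitions as
 a function. It is an illustration only, not the WormBase build code, and
 the function name and its two arguments are hypothetical.

```python
def confirmation_status(exon_bases: int, covered_bases: int) -> str:
    """Classify a CDS by how many of its exon bases have transcript evidence.

    exon_bases    -- total number of bases in all exons of the CDS
    covered_bases -- number of those bases covered by mRNA, EST etc.
    """
    if exon_bases <= 0:
        raise ValueError("a CDS must have at least one exon base")
    if not 0 <= covered_bases <= exon_bases:
        raise ValueError("covered_bases must lie between 0 and exon_bases")

    if covered_bases == exon_bases:
        return "Confirmed"            # every base of every exon is covered
    if covered_bases > 0:
        return "Partially_confirmed"  # some, but not all, bases are covered
    return "Predicted"                # no transcript evidence at all


print(confirmation_status(900, 900))
print(confirmation_status(900, 450))
print(confirmation_status(900, 0))
```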
 
 
 
 Status of entries: Protein Accessions
 -------------------------------------
 UniProtKB/Swiss-Prot accessions   3439 (14.6%)
 UniProtKB/TrEMBL accessions     18829 (80.1%)
 
 
 
 Status of entries: Protein_IDs in EMBL
 --------------------------------------
 Protein_id            22278 (94.7%)
 
 
 
 Gene <-> CDS, Transcript, Pseudogene connections (CGC-approved)
 ----------------------------------------------------------------
 Entries with CGC-approved Gene name  13107
 
 
 GeneModel correction progress WS180 -> WS181
 -----------------------------------------
 Confirmed introns not in a CDS gene model:
 
 		+---------+--------+
 		| Introns | Change |
 		+---------+--------+
 Cambridge	|     24  |     1  |
 St Louis 	|    157  |    -1  |
 		+---------+--------+
 
 
 Members of known repeat families that overlap predicted exons:
 
 		+---------+--------+
 		| Repeats | Change |
 		+---------+--------+
 Cambridge	|      6  |     0  |
 St Louis 	|      6  |     0  |
 		+---------+--------+
 
 
 
 Synchronisation with GenBank / EMBL:
 ------------------------------------
 
 No synchronisation issues
 
 
 There are no gaps remaining in the genome sequence
 ---------------
 For more info mail help@wormbase.org
 -===================================================================================-
 
 
 
 New Data:
 ---------
 
 The following BLAST databases were updated to the latest version:
 gadfly
 ipi_human
 yeast
 
 
 Genome sequence updates:
 -----------------------
 
 
 New Fixes:
 ----------
 
 
 Known Problems:
 ---------------
 
 
 Other Changes:
 --------------
 
 Analysis objects are now being used for evidence of orthologs.
 
 
 Proposed Changes / Forthcoming Data:
 -------------------------------------
 
 2944 5' and 3' RACE sequences from the Vidal lab will be aligned to
 the C. elegans genome, giving an improved view of the ends of many
 genes.
 
 Proposed Model Changes
 ----------------------
 
 Added History tracking tags to ?Feature class
 Removed WashU_ID and Exelixis_ID as Variation name types
 Added Amber_UAG_or_Opal_UGA as final ambiguous mutation
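
 The nonsense tags pair each stop codon with its traditional name (amber
 UAG, ochre UAA, opal UGA), and the ambiguous tags join two of them with
 "_or_". The Python sketch below builds a tag name from the set of stop
 codons a mutation could have produced. The function name nonsense_tag is
 hypothetical; the tag names it produces are the ones that appear in these
 notes.

```python
STOP_CODON_NAMES = {"UAG": "Amber", "UAA": "Ochre", "UGA": "Opal"}

# Order in which codons appear inside the tag names used in these notes.
TAG_ORDER = ["UAG", "UAA", "UGA"]


def nonsense_tag(possible_codons: set[str]) -> str:
    """Return the tag name for one stop codon or an ambiguous pair."""
    unknown = possible_codons - set(TAG_ORDER)
    if unknown:
        raise ValueError(f"not a stop codon: {sorted(unknown)}")
    if not 1 <= len(possible_codons) <= 2:
        # The notes define tags for single codons and for pairs only.
        raise ValueError("expected one or two stop codons")
    ordered = [codon for codon in TAG_ORDER if codon in possible_codons]
    return "_or_".join(f"{STOP_CODON_NAMES[codon]}_{codon}" for codon in ordered)


print(nonsense_tag({"UAG"}))
print(nonsense_tag({"UGA", "UAG"}))
```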
 
 
 Model Changes:
 ------------------------------------
 
 #####################################################################
 
 #Molecular_change   Nonsense UNIQUE Amber_UAG Text #Evidence
                              Ochre_UAA Text #Evidence
                              Opal_UGA  Text #Evidence
                              Ochre_UAA_or_Opal_UGA Text #Evidence
                              Amber_UAG_or_Ochre_UAA Text #Evidence
 
 
 ######################################################################
 
 ?Transcript
 Properties    Transcript    mRNA
                 miRNA
                 ncRNA
                 rRNA
                 scRNA
                 snRNA
                 snlRNA *new Small Nuclear Like RNA
                 snoRNA
                 stRNA
                 tRNA
                 u21RNA
 
 #################################################################################################
 
 #Splice_confirmation     cDNA ?Sequence // ?Sequence link to flag which cDNA is confirming
            EST ?Sequence  // or falsifying the intron in question, added [031121 krb]
            OST
            mRNA
            Homology
            UTR ?Sequence
            False ?Sequence
            Inconsistent ?Sequence
 
 
 #################################################################################################
 
 // The main Analysis class, which holds information about publications, persons and other evidence, and the WormBase release used
 ?Analysis    Source_database ?Database
                   Based_on_WB_Release Int
                   Based_on_DB_Release Text
                   Description ?Text
                   Reference ?Paper XREF Describes_analysis
                   Conducted_by ?Person XREF Conducted  // not always the same as the author of the paper - eg Erich running OrthoMCL
                   URL Text   // eg www.treefam.org  (or would this be covered by Source_database?)
 
 
 // changes to the Evidence
 #Evidence From_analysis ?Analysis
 
 // Paper class changes
 ?Paper Describes_analysis ?Analysis XREF Reference
 
 // Person class changes
 ?Person Conducted ?Analysis XREF Conducted_by
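
 In an acedb model, XREF means that filling a tag on one object also fills
 the named tag on the object it points to. The Python sketch below mimics
 that reciprocal behaviour for the Analysis, Paper and Person links shown
 above. It is a plain-Python illustration, not acedb code; the class layout,
 method names and object names (such as WBPaper_demo) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # identity comparison; the objects reference each other
class Paper:
    name: str
    describes_analysis: list = field(default_factory=list)


@dataclass(eq=False)
class Person:
    name: str
    conducted: list = field(default_factory=list)


@dataclass(eq=False)
class Analysis:
    name: str
    based_on_wb_release: Optional[int] = None
    description: Optional[str] = None
    url: Optional[str] = None
    reference: list = field(default_factory=list)
    conducted_by: list = field(default_factory=list)

    def add_reference(self, paper: Paper) -> None:
        """Reference ?Paper, with the reciprocal Describes_analysis link."""
        if paper not in self.reference:
            self.reference.append(paper)
        if self not in paper.describes_analysis:
            paper.describes_analysis.append(self)

    def add_conducted_by(self, person: Person) -> None:
        """Conducted_by ?Person, with the reciprocal Conducted link."""
        if person not in self.conducted_by:
            self.conducted_by.append(person)
        if self not in person.conducted:
            person.conducted.append(self)


analysis = Analysis("demo_analysis", based_on_wb_release=181)
paper = Paper("WBPaper_demo")
person = Person("WBPerson_demo")
analysis.add_reference(paper)
analysis.add_conducted_by(person)
print(paper.describes_analysis[0].name, person.conducted[0].name)
```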
 
 
 -===================================================================================-
 
 
 Quick installation guide for UNIX/Linux systems
 -----------------------------------------------
 
 1. Create a new directory to contain your copy of WormBase,
 	e.g. /users/yourname/wormbase
 
 2. Uncompress and untar all of the database.*.tar.gz files into
 	this directory. You will need approximately 2-3 GB of disk space.
 
 3. Obtain and install a suitable acedb binary for your system
 	(available from www.acedb.org).
 
 4. Use the acedb 'xace' program to open your database, e.g.
 	type 'xace /users/yourname/wormbase' at the command prompt.
 
 5. See the acedb website for more information about acedb and
 	using xace.
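
 Steps 1 and 2 can also be scripted. The Python sketch below creates the
 target directory and extracts every database.*.tar.gz archive into it,
 using only the standard library. The function name unpack_release and the
 download directory in the usage lines are hypothetical; installing acedb
 and starting xace (steps 3 to 5) are still done by hand.

```python
import glob
import os
import tarfile


def unpack_release(archive_dir: str, target_dir: str) -> list[str]:
    """Extract every database.*.tar.gz found in archive_dir into target_dir.

    Returns the list of archives that were unpacked.
    """
    os.makedirs(target_dir, exist_ok=True)  # step 1: create the directory
    archives = sorted(glob.glob(os.path.join(archive_dir, "database.*.tar.gz")))
    for path in archives:                   # step 2: uncompress and untar
        with tarfile.open(path, "r:gz") as archive:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions offer a safer extraction filter.
                archive.extractall(target_dir, filter="data")
            else:
                archive.extractall(target_dir)
    return archives


if __name__ == "__main__":
    # Assumed locations; adjust both paths to your own system.
    done = unpack_release("/users/yourname/downloads", "/users/yourname/wormbase")
    print(f"unpacked {len(done)} archive(s)")
```

 After this, 'xace /users/yourname/wormbase' opens the database as in step 4.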
 
 ____________  END _____________