Difference between revisions of "WS181"
From WormBaseWiki
Jump to navigationJump to search (New page: <nowiki> New release of WormBase WS181, Wormpep181 and Wormrna181 Fri Sep 7 11:40:20 BST 2007 WS181 was built by Gary Williams ====================================================...) |
|||
Line 138: | Line 138: | ||
There are no gaps remaining in the genome sequence | There are no gaps remaining in the genome sequence | ||
--------------- | --------------- | ||
− | For more info mail | + | For more info mail help@wormbase.org |
-===================================================================================- | -===================================================================================- | ||
Latest revision as of 16:47, 21 December 2011
New release of WormBase WS181, Wormpep181 and Wormrna181 Fri Sep 7 11:40:20 BST 2007 WS181 was built by Gary Williams ====================================================================== This directory includes: i) database.WS181.*.tar.gz - compressed data for new release ii) models.wrm.WS181 - the latest database schema (also in above database files) iii) CHROMOSOMES/subdir - contains 3 files (DNA, GFF & AGP per chromosome) iv) WS181-WS180.dbcomp - log file reporting difference from last release v) wormpep181.tar.gz - full Wormpep distribution corresponding to WS181 vi) wormrna181.tar.gz - latest WormRNA release containing non-coding RNA's in the genome vii) confirmed_genes.WS181.gz - DNA sequences of all genes confirmed by EST &/or cDNA viii) cDNA2orf.WS181.gz - Latest set of ORF connections to each cDNA (EST, OST, mRNA) ix) gene_interpolated_map_positions.WS181.gz - Interpolated map positions for each coding/RNA gene x) clone_interpolated_map_positions.WS181.gz - Interpolated map positions for each clone xi) best_blastp_hits.WS181.gz - for each C. elegans WormPep protein, lists Best blastp match to human, fly, yeast, C. briggsae, and SwissProt & TrEMBL proteins. xii) best_blastp_hits_brigprot.WS181.gz - for each C. briggsae protein, lists Best blastp match to human, fly, yeast, C. elegans, and SwissProt & TrEMBL proteins. xiii) geneIDs.WS181.gz - list of all current gene identifiers with CGC & molecular names (when known) xiv) PCR_product2gene.WS181.gz - Mappings between PCR products and overlapping Genes Release notes on the web: ------------------------- http://www.wormbase.org/wiki/index.php/Release_notes Genome sequence composition: ---------------------------- WS181 WS180 change ---------------------------------------------- a 32365949 32365949 +0 c 17779887 17779887 +0 g 17756036 17756036 +0 t 32365750 32365750 +0 n 0 0 +0 - 0 0 +0 Total 100267622 100267622 +0 Chromosomal Changes: -------------------- There are no changes to the chromosome sequences in this release. Gene data set (Live C.elegans genes 29501) ------------------------------------------ Molecular_info 27790 (94.2%) Concise_description 4889 (16.6%) Reference 7211 (24.4%) CGC_approved Gene name 14732 (49.9%) RNAi_result 20690 (70.1%) Microarray_results 19945 (67.6%) SAGE_transcript 18645 (63.2%) Wormpep data set: ---------------------------- There are 20144 CDS in autoace, 23518 when counting 3374 alternate splice forms. The 23518 sequences contain base pairs in total. Modified entries 23 Deleted entries 0 New entries 11 Reappeared entries 0 Net change +11 Status of entries: Confidence level of prediction (based on the amount of transcript evidence) ------------------------------------------------- Confirmed 8109 (34.5%) Every base of every exon has transcription evidence (mRNA, EST etc.) Partially_confirmed 10746 (45.7%) Some, but not all exon bases are covered by transcript evidence Predicted 4663 (19.8%) No transcriptional evidence at all Status of entries: Protein Accessions ------------------------------------- UniProtKB/Swiss-Prot accessions 3439 (14.6%) UniProtKB/TrEMBL accessions 18829 (80.1%) Status of entries: Protein_ID's in EMBL --------------------------------------- Protein_id 22278 (94.7%) Gene <-> CDS,Transcript,Pseudogene connections (cgc-approved) --------------------------------------------- Entries with CGC-approved Gene name 13107 GeneModel correction progress WS180 -> WS181 ----------------------------------------- Confirmed introns not in a CDS gene model; +---------+--------+ | Introns | Change | +---------+--------+ Cambridge | 24 | 1 | St Louis | 157 | -1 | +---------+--------+ Members of known repeat families that overlap predicted exons; +---------+--------+ | Repeats | Change | +---------+--------+ Cambridge | 6 | 0 | St Louis | 6 | 0 | +---------+--------+ Synchronisation with GenBank / EMBL: ------------------------------------ No synchronisation issues There are no gaps remaining in the genome sequence --------------- For more info mail help@wormbase.org -===================================================================================- New Data: --------- The following BLAST databases were updated to the latest version: gadfly ipi_human yeast Genome sequence updates: ----------------------- New Fixes: ---------- Known Problems: --------------- Other Changes: -------------- Analysis objects are now being used for evidence of orthologs. Proposed Changes / Forthcoming Data: ------------------------------------- 2944 5' and 3' RACE sequences from the Vidal lab will be aligned to the C.elegans genome giving an improved view of the ends of many genes. Proposed Model Changes ---------------------- Added History tracking tags to ?Feature class Removed WashU_ID and Exelixis_ID as Variation name types Added Amber_UAG_or_Opal_UGA as final ambiguous mutation Model Changes: ------------------------------------ ##################################################################### #Molecular_change Nonsense UNIQUE Amber_UAG Text #Evidence Ochre_UAA Text #Evidence Opal_UGA Text #Evidence Ochre_UAA_or_Opal_UGA Text #Evidence Amber_UAG_or_Ochre_UAA Text #Evidence ###################################################################### ?Transcript Properties Transcript mRNA miRNA ncRNA rRNA scRNA snRNA snlRNA *new Small Nuclear Like RNA snoRNA stRNA tRNA u21RNA ################################################################################################# #Splice_confirmation cDNA ?Sequence // ?Sequence link to flag which cDNA is confirming EST ?Sequence // or falsifying the intron in question, added [031121 krb] OST mRNA Homology UTR ?Sequence False ?Sequence Inconsistent ?Sequence ################################################################################################# // the main Analysis class To hold information about Publications / Persons / other evidence and the used WormBase Release ?Analysis Source_database ?Database Based_on_WB_Release Int Based_on_DB_Release Text Description ?Text Reference ?Paper XREF Describes_analysis Conducted_by ?Person XREF Describes_analysis // not always the same as the author of the paper - eg Erich running OrthoMCL URL Text // eg www.treefam.org (or would this be covered by Source_database?) // changes to the Evidence #Evidence From_analysis ?Analysis // Paper class changes ?Paper Describes_analysis ?Analysis XREF Reference // Person class changes ?Person Conducted ?Analysis XREF Conducted_by -===================================================================================- Quick installation guide for UNIX/Linux systems ----------------------------------------------- 1. Create a new directory to contain your copy of WormBase, e.g. /users/yourname/wormbase 2. Unpack and untar all of the database.*.tar.gz files into this directory. You will need approximately 2-3 Gb of disk space. 3. Obtain and install a suitable acedb binary for your system (available from www.acedb.org). 4. Use the acedb 'xace' program to open your database, e.g. type 'xace /users/yourname/wormbase' at the command prompt. 5. See the acedb website for more information about acedb and using xace. ____________ END _____________