Difference between revisions of "WS169"
Latest revision as of 10:56, 21 December 2011
Release Letter
New release of WormBase WS169, Wormpep169 and Wormrna169 Fri Dec 22 15:09:52 GMT 2006

WS169 was built by Paul
======================================================================

This directory includes:
i)    database.WS169.*.tar.gz - compressed data for new release
ii)   models.wrm.WS169 - the latest database schema (also in above database files)
iii)  CHROMOSOMES/subdir - contains 3 files (DNA, GFF & AGP per chromosome)
iv)   WS169-WS168.dbcomp - log file reporting difference from last release
v)    wormpep169.tar.gz - full Wormpep distribution corresponding to WS169
vi)   wormrna169.tar.gz - latest WormRNA release containing non-coding RNAs in the genome
vii)  confirmed_genes.WS169.gz - DNA sequences of all genes confirmed by EST &/or cDNA
viii) cDNA2orf.WS169.gz - Latest set of ORF connections to each cDNA (EST, OST, mRNA)
ix)   gene_interpolated_map_positions.WS169.gz - Interpolated map positions for each coding/RNA gene
x)    clone_interpolated_map_positions.WS169.gz - Interpolated map positions for each clone
xi)   best_blastp_hits.WS169.gz - for each C. elegans WormPep protein, lists Best blastp match to
      human, fly, yeast, C. briggsae, and SwissProt & TrEMBL proteins.
xii)  best_blastp_hits_brigprot.WS169.gz - for each C. briggsae protein, lists Best blastp match to
      human, fly, yeast, C. elegans, and SwissProt & TrEMBL proteins.
xiii) geneIDs.WS169.gz - list of all current gene identifiers with CGC & molecular names (when known)
xiv)  PCR_product2gene.WS169.gz - Mappings between PCR products and overlapping Genes

Release notes on the web:
-------------------------
http://www.wormbase.org/wiki/index.php/Release_notes

Genome sequence composition:
----------------------------

            WS169        WS168    change
----------------------------------------------
a        32365889     32365889        +0
c        17779856     17779857        -1
g        17756016     17756012        +4
t        32365689     32365686        +3
n               0            0        +0

Total   100267450    100267444        +6

Total number of bases has increased due to a number of genome sequence changes; see notes further down.
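The totals and per-base changes in the genome sequence composition table above can be re-derived from the individual base counts. The following is a minimal sketch of that arithmetic; the names `COUNTS` and `composition_summary` are illustrative only and are not part of any WormBase tool.

```python
# Base counts copied from the composition table: base -> (WS169, WS168).
COUNTS = {
    "a": (32365889, 32365889),
    "c": (17779856, 17779857),
    "g": (17756016, 17756012),
    "t": (32365689, 32365686),
    "n": (0, 0),
}


def composition_summary(counts):
    """Return (new_total, old_total, per-base change) for a composition table."""
    new_total = sum(new for new, _ in counts.values())
    old_total = sum(old for _, old in counts.values())
    changes = {base: new - old for base, (new, old) in counts.items()}
    return new_total, old_total, changes


new_total, old_total, changes = composition_summary(COUNTS)
for base, delta in changes.items():
    print(f"{base}  {delta:+d}")
print(f"Total  {new_total}  {old_total}  {new_total - old_total:+d}")
```

The per-base changes sum to the same +6 reported in the Total row.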
Chromosomal Changes:
--------------------
Chromosome: I
7140372   7140372   1  ->  7140372   7140372   1
13168444  13168443  0  ->  13168444  13168444  1
Chromosome: II
8870302   8870301   0  ->  8870302   8870302   1
10828222  10828222  1  ->  10828223  10828223  1
11864669  11864668  0  ->  11864670  11864670  1
Chromosome: III
3483743   3483742   0  ->  3483743   3483743   1
Chromosome: IV
12082434  12082433  0  ->  12082434  12082434  1
12124684  12124704  21 ->  12124685  12124703  19
Chromosome: V
13100100  13100109  10 ->  13100100  13100111  12
13100695  13100695  1  ->  13100697  13100696  0
16896532  16896531  0  ->  16896533  16896533  1
Chromosome: X
9890865   9890864   0  ->  9890865   9890865   1
11588899  11588899  1  ->  11588900  11588899  0
14847044  14847043  0  ->  14847044  14847044  1

Gene data set (Live C.elegans genes 23964)
------------------------------------------
Molecular_info            22264 (92.9%)
Concise_description        4288 (17.9%)
Reference                  6655 (27.8%)
CGC_approved Gene name     8907 (37.2%)
RNAi_result               19841 (82.8%)
Microarray_results        19132 (79.8%)
SAGE_transcript           20026 (83.6%)

Wormpep data set:
----------------------------
There are 20083 CDS in autoace, 23221 when counting 3138 alternate splice forms.

The 23221 sequences contain 10,183,692 base pairs in total.

Modified entries      32
Deleted entries       2
New entries           10
Reappeared entries    1

Net change  +9

Status of entries: Confidence level of prediction (based on the amount of transcript evidence)
-------------------------------------------------
Confirmed              7822 (33.7%)  Every base of every exon has transcription evidence (mRNA, EST etc.)
Partially_confirmed   10737 (46.2%)  Some, but not all exon bases are covered by transcript evidence
Predicted              4662 (20.1%)  No transcriptional evidence at all

Status of entries: Protein Accessions
-------------------------------------
UniProtKB/Swiss-Prot accessions   3270 (14.1%)
UniProtKB/TrEMBL accessions      19461 (83.8%)

Status of entries: Protein_ID's in EMBL
---------------------------------------
Protein_id   22731 (97.9%)

Gene <-> CDS,Transcript,Pseudogene connections (cgc-approved)
---------------------------------------------
Entries with CGC-approved Gene name   7261

GeneModel correction progress WS168 -> WS169
-----------------------------------------
Confirmed introns not in a CDS gene model;

           +---------+--------+
           | Introns | Change |
           +---------+--------+
Cambridge  |      15 |      0 |
St Louis   |      11 |      1 |
           +---------+--------+

Members of known repeat families that overlap predicted exons;

           +---------+--------+
           | Repeats | Change |
           +---------+--------+
Cambridge  |       6 |      0 |
St Louis   |       6 |      0 |
           +---------+--------+

Synchronisation with GenBank / EMBL:
------------------------------------
CHROMOSOME_IV sequence Z68507

There are no gaps remaining in the genome sequence
---------------
For more info mail help@wormbase.org
-===================================================================================-

New Data:
---------
* Incorporation of new SAGE libraries.
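The confidence-level percentages in the Wormpep status table above are each count divided by the 23221 Wormpep sequences. The sketch below reproduces them; the helper `percent` and the constant names are illustrative assumptions, not part of the release tooling.

```python
# Counts copied from the "Confidence level of prediction" table.
STATUS = {
    "Confirmed": 7822,
    "Partially_confirmed": 10737,
    "Predicted": 4662,
}
TOTAL_SEQUENCES = 23221  # CDS including alternate splice forms


def percent(count, total):
    """Format count as a percentage of total, to one decimal place."""
    return f"{100.0 * count / total:.1f}%"


for name, count in STATUS.items():
    print(f"{name:<20}{count:>6} ({percent(count, TOTAL_SEQUENCES)})")
```

The three categories add up to the full 23221 sequences, so every Wormpep entry falls into exactly one confidence level.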
Genome sequence updates:
-----------------------
C27C12   Insertion   attcagcaagctattctagTctctcgactcatacgtcattt
C34B4    Insertion   ttgagtttgatggttcaactgaaatTggtcagtgtc
C34B4    Insertion   ggtcagtgtcTttcttcactttgcctgaaacttgga
C34B4    Deletion    gagaataaacttcattAcctcagatattcctgtt
C34C12   Insertion   gataaatccgacttggcgggGaagttcttgccgccctgg
F09C6    Insertion   tatccaaaaaaatcctatTtgaaggaagttcagaagctatac
F11A10   Insertion   ttttgaagatgtacactGctttgcagcgacaaatgag
F21G4    Insertion   accattgggaattcccggagGaaaagtgtgatgttttctttaaat
F22G12   Insertion   attccctgaacttcggagcaatCactcatcaacgatcagctcgac
F52D10   Deletion    cacatcacagtatttattcCcaacatcaatctttaacggta
K10D3    Deletion    tatcgagcttgaagtaccgtGttcaattggagcctcagag
K10D3    Insertion   tatcgagcttgaagtaccgtTttcaattggagcctcagag
K12D12   Insertion   gaaaaatttggcttgggCcacgaatctttctacgggcggt
M18      Deletion    aatattttacacaatcaccCaatttttatatttatcgttc
M18      Deletion    atttttatatttatcgttcCtactactttcctttctcgtga
T07D4    Insertion   aacctgatcccggcgGgcgttgacgtgcttttaa
VM106R   Deletion    gtcctgatgatggcgagTgatacacgtcgcga
VM106R   Insertion   gtcctgatgatggcgagCgatacacgtcgcga

NET +6

New Fixes:
----------
* Pictar binding sites have been updated as they have been discrepant in past releases.

Known Problems:
---------------
Blast on

Other Changes:
--------------

Proposed Changes / Forthcoming Data:
-------------------------------------

Model Changes:
------------------------------------
Added
?Homology_group Group_type UNIQUE OrthoMCL_group
for Erich

and

?Gene Orthologue_other ?Database ?Database_field UNIQUE ?Accession_number ?Species #Evidence

so that we can connect Ortholog pairs to species outside those that we create ?Gene objects for eg Human

-===================================================================================-

Quick installation guide for UNIX/Linux systems
-----------------------------------------------

1. Create a new directory to contain your copy of WormBase,
   e.g. /users/yourname/wormbase

2. Unpack and untar all of the database.*.tar.gz files into
   this directory. You will need approximately 2-3 Gb of disk space.

3. Obtain and install a suitable acedb binary for your system
   (available from www.acedb.org).

4. Use the acedb 'xace' program to open your database, e.g.
   type 'xace /users/yourname/wormbase' at the command prompt.

5. See the acedb website for more information about acedb and
   using xace.

____________ END _____________
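Steps 1 and 2 of the installation guide above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `unpack_release` and its arguments are hypothetical, and it assumes the database.*.tar.gz files have already been downloaded into a single directory.

```python
import glob
import os
import tarfile


def unpack_release(download_dir, target_dir):
    """Extract every database.*.tar.gz archive in download_dir into target_dir.

    Returns the list of archive paths that were unpacked.
    """
    # Step 1: create the directory that will hold the local copy.
    os.makedirs(target_dir, exist_ok=True)

    # Step 2: unpack each compressed database archive into that directory.
    pattern = os.path.join(download_dir, "database.*.tar.gz")
    archives = sorted(glob.glob(pattern))
    for archive in archives:
        with tarfile.open(archive, "r:gz") as tar:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions support safer extraction filters.
                tar.extractall(target_dir, filter="data")
            else:
                tar.extractall(target_dir)
    return archives
```

For example, `unpack_release(".", "/users/yourname/wormbase")` would unpack the archives found in the current directory into the location used in the guide. Installing acedb and opening the database with xace (steps 3 to 5) remain manual.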