Hinxton 2015.08- Meeting minutes
From WormBaseWiki
Latest revision as of 10:57, 21 August 2015
August 2015
August 20, 2015
In attendance: KH, TD, GW, MP, JL
Build 250
- Problems
- Documentation error meant that crucial step at start of build was missed.
- Result was that all non-elegans species were built using old data
- Needed to re-start all T2 builds from scratch
- Re-run of BLASTs, InterProScan, Compara
- Timing
- Build will probably be one week late.
- MP will be on vacation at end of build, GW to finish build
Database migration
- Curation tools (Colonnade, editor)
- Numerous fixes/changes based on feedback from Mary Ann
- Added KeySet functionality
- Importer/Exporter
- Cannot re-implement AceDB's partial case-insensitivity in Datomic.
- WBGene00000001 != WBgene00000001 in new system
- GW to run the importer for WS250
- Discussed cut-and-paste-into-shell method of running the importer
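One possible way to cope with the loss of case-insensitive lookup noted above is to canonicalise identifiers before they reach the importer. The following is only a minimal sketch under that assumption; the function names `build_case_index` and `canonicalise` are hypothetical and are not part of the importer discussed in the meeting.

```python
from typing import Dict, Iterable, Optional


def build_case_index(canonical_ids: Iterable[str]) -> Dict[str, str]:
    """Map the lower-cased form of each identifier to its canonical spelling.

    Hypothetical helper: raises if two distinct identifiers differ only by case,
    since such a pair could not be told apart by a case-insensitive lookup.
    """
    index: Dict[str, str] = {}
    for cid in canonical_ids:
        key = cid.lower()
        if key in index and index[key] != cid:
            raise ValueError(f"case collision: {index[key]!r} vs {cid!r}")
        index[key] = cid
    return index


def canonicalise(identifier: str, index: Dict[str, str]) -> Optional[str]:
    """Return the canonical spelling of an identifier, or None if it is unknown."""
    return index.get(identifier.lower())


index = build_case_index(["WBGene00000001"])
# The mis-cased form from the minutes resolves to the canonical spelling.
resolved = canonicalise("WBgene00000001", index)
```

Running such a step ahead of the import would keep the case-sensitive store consistent, at the cost of maintaining the index of canonical names.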
Curation
- Weird genes
- Handful of genes in C. elegans that have contrived structures to accommodate a frameshift / in-frame stop
- In some cases, the frameshift/stop is conserved in other Caenorhabditis.
- GW doing further work to investigate what this might mean
- Mass spec data from Lamond lab
- Aligning mass-spec peptides to genome, will appear as a genome browser track
- Further discussed capturing/registering when a curator has reviewed a gene model but not made a change
- Should boil this down to a proposal and circulate with whole group
August 14, 2015
In attendance: KH, TD, PD, GW, BB, MP
Database migration
- Security discussion
- Script will authenticate to the application using self-signed SSL certificates
- What level of granularity for the certificates? Per center? Per user? Per update type (e.g. a separate cert for each build step)
- Transaction log
- Web app now allows unrolling of recent transactions
- Functionality added to detect when unrolling of a T impacts on Ts made subsequently (e.g. T1 makes object, T2 refers to object, user attempts to undo T1).
- Environment
- GW previously had trouble getting Datomic running on an EBI machine on NFS-mounted storage.
- The free storage engine with Datomic assumes/requires low-latency (ideally directly attached) storage
- GW to try again on acedb.ebi.ac.uk using /nfs/acedb/vol1 (purportedly low-latency storage)
- 2-way XREFs with different #models attached at either end
- e.g. Gene -> Variation (#Evidence), Variation -> Gene (#Molecular_change)
- New db puts the #model annotation on the edges of the graph, so does not currently support this.
- Discussed possible workarounds
- Loading Ace into Datomic
- Some testing done by GW, works well
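The dependency check described under "Transaction log" above (T1 makes an object, T2 refers to it, user attempts to undo T1) can be illustrated with a small sketch. It assumes a simplified in-memory log in which each transaction records the entities it created and the entities it referenced, and in which transaction IDs increase with time. The `Tx` class and `conflicting_later_txs` function are hypothetical and do not reflect how the web app or Datomic actually implements this.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Tx:
    """Simplified transaction record (hypothetical model, not Datomic's)."""
    tx_id: int
    created: Set[str] = field(default_factory=set)
    referenced: Set[str] = field(default_factory=set)


def conflicting_later_txs(log: List[Tx], target_id: int) -> List[int]:
    """Return IDs of later transactions that refer to entities created by the target."""
    target = next((tx for tx in log if tx.tx_id == target_id), None)
    if target is None:
        raise KeyError(f"no transaction with id {target_id}")
    return [
        tx.tx_id
        for tx in log
        if tx.tx_id > target_id and tx.referenced & target.created
    ]


log = [
    Tx(1, created={"Gene:X"}),
    Tx(2, referenced={"Gene:X"}),
    Tx(3, created={"Gene:Y"}),
]
# Undoing T1 would affect T2, which refers to the object T1 created.
blockers = conflicting_later_txs(log, 1)
```

An empty result would mean the transaction can be unrolled without affecting anything made after it.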
Build 250
- Status
- BLAST - all 9 species run, dumped and loaded
- Compara - finished and loaded
- RNASeq - in progress
- General problems/fire-fighting
- Transcript builder (mysteriously did not complete for two genomes)
- InterProScan (found a bug in the Ensembl code, fixed and provided patch)
- BLAST dumping (exceeded MySQL max connections. Needed to tweak server conf)
ParaSite
- Redesign of homepage
- Beta version demonstrated. Will circulate some other possible layouts
- New blog for ParaSite
- wbparasite.wordpress.com
Data integration / curation
- Brugia
- Produced initial annotation of new assembly.
- MP presented to Brugia analysis group in conference call
- C. elegans
- DB_remark clean up -> Brief_identification, pseudogenes (will go in as a note in ENA submissions)
- Classification of pseudogenes. Will need a new unitary pseudogene tag in ?Pseudogene model
- Consolidation of UniProt product names into the database
- SVM paper => gene curation
- Proteomics data from Lamond lab
August 6, 2015
In attendance: KH, PD, MP, TD, BB
Build 250
- In progress. MP doing all 9 core species (no precedent for this).
- First build from github code base (no problems so far)
- How long will BLAST for all 9 species take? (All BLASTX and BLASTP will need to be re-done, since all of the worm proteomes have changed.)
- Problems
- RNAis with bad DNA_text. Cleaned and reported to Caltech
- NCBI taxonomy database at EBI changed location, config needed a patch
Data integration / curation
- B. malayi
- WS248 Brugia submission now live at ENA
- Gene prediction on new assembly
- Looking at RATT transfers from old assembly
- Using HAL mapping to project genes also works quite well.
- AUGUSTUS to fill in the gaps
- Handover will be Monday 10th August (during Brugia conference call)
- C. remanei
- Bad data. Submitter does not have time right now to clean. Will be deferred from this build
- General
- Hawaiian strain - will go in as T3. BLASTable.
- Cleaning up T2. Getting rid of in-frame STOPs for P. pacificus
- Cleaning up Brief_identification for protein-coding and pseudogenes.
Database migration
- New interface for uploading ace files.
- Discussed potential strategies for schema updates
ParaSite
- New release out
- BLAST
- Fixed glitches in new service
- Discussed potential minor improvements to front page
- Analytics
- Redesign of the home page. Make it more WormBase compliant.
- Search box confusion