Difference between revisions of "Hinxton 2015.08- Meeting minutes"
From WormBaseWiki
Latest revision as of 10:57, 21 August 2015
August 2015
August 20, 2015
In attendance: KH, TD, GW, MP, JL
Build 250
- Problems
- Documentation error meant that crucial step at start of build was missed.
- Result was that all non-elegans species were built using old data
- Needed to re-start all T2 builds from scratch
- Re-run of BLASTs, InterProScan, Compara
- Timing
- Build will probably be one week late.
- MP will be on vacation at end of build, GW to finish build
Database migration
- Curation tools (Colonnade, editor)
- Numerous fixes/changes based on feedback from Mary Ann
- Added KeySet functionality
- Importer/Exporter
- Cannot re-implement AceDB's partial case-insensitivity in Datomic.
- WBGene00000001 != WBgene00000001 in new system
- GW to run the importer for WS250
- Discussed cut-and-paste-into-shell method of running the importer
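Since identifiers that differ only in case are distinct in the new system, one option is to canonicalise them before import. The following is a minimal sketch only, not the importer's actual behaviour: the function name and the assumed identifier pattern ("WBGene" followed by 8 digits) are hypothetical illustrations.

```python
import re

# Assumption for illustration: a gene identifier is "WBGene" (in any
# letter case) followed by exactly 8 digits.
_GENE_ID = re.compile(r"wbgene(\d{8})", re.IGNORECASE)


def canonical_gene_id(raw):
    """Return the canonical spelling of a gene identifier.

    Raises ValueError if the input does not look like a gene identifier.
    """
    match = _GENE_ID.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a gene identifier: {raw!r}")
    return "WBGene" + match.group(1)


if __name__ == "__main__":
    # Both spellings map to the same canonical key.
    print(canonical_gene_id("WBgene00000001"))
    print(canonical_gene_id("WBGene00000001"))
```

Normalising at import time keeps the stored keys case-sensitive while still accepting the mixed-case spellings that AceDB tolerated.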
Curation
- Weird genes
- Handful of genes in C. elegans that have contrived structures to accommodate a frameshift / in-frame stop
- In some cases, the frameshift/stop is conserved in other Caenorhabditis.
- GW doing further work to investigate what this might mean
- Mass spec data from Lamond lab
- Aligning mass-spec peptides to genome, will appear as a genome browser track
- Further discussed capturing/registering when a curator has reviewed a gene model but not made a change
- Should boil this down to a proposal and circulate with whole group
August 14, 2015
In attendance: KH, TD, PD, GW, BB, MP
Database migration
- Security discussion
- Script will authenticate to the application using self-signed SSL certificates
- What level of granularity for the certificates? Per center? Per user? Per update type (e.g. a separate cert for each build step)?
- Transaction log
- Web app now allows unrolling of recent transactions
- Functionality added to detect when unrolling of a T impacts on Ts made subsequently (e.g. T1 makes object, T2 refers to object, user attempts to undo T1).
- Environment
- GW previously had trouble getting Datomic running on an EBI machine with NFS-mounted storage.
- The free storage engine with Datomic assumes/requires low-latency (ideally directly attached) storage
- GW to try again on acedb.ebi.ac.uk using /nfs/acedb/vol1 (purportedly low-latency storage)
- 2-way XREFs with different attached # models at either end
- e.g. Gene -> Variation (#Evidence), Variation -> Gene (#Molecular_change)
- New db puts #model annotation on the edges of the graph, so does not currently support this.
- Discussed possible workarounds
- Loading Ace into Datomic
- Some testing done by GW, works well
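The dependency check described under "Transaction log" (T1 makes an object, T2 refers to it, user attempts to undo T1) can be sketched as follows. This is an illustrative sketch, not the web app's implementation: the `Tx` record, its fields and the function name are all hypothetical, and the log is assumed to use increasing transaction ids.

```python
from dataclasses import dataclass, field


@dataclass
class Tx:
    """Hypothetical, simplified record of one transaction."""
    tx_id: int
    creates: set = field(default_factory=set)    # objects made by this transaction
    refers_to: set = field(default_factory=set)  # objects this transaction points at


def conflicts_on_undo(log, target_id):
    """Return ids of later transactions that refer to objects created by the target."""
    target = next(tx for tx in log if tx.tx_id == target_id)
    return [
        tx.tx_id
        for tx in log
        if tx.tx_id > target_id and tx.refers_to & target.creates
    ]


if __name__ == "__main__":
    log = [
        Tx(1, creates={"gene-A"}),
        Tx(2, refers_to={"gene-A"}),
        Tx(3, refers_to={"gene-B"}),
    ]
    # Undoing transaction 1 would affect transaction 2, which refers to gene-A.
    print(conflicts_on_undo(log, 1))
```

A non-empty result means the undo should be flagged to the user rather than applied silently.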
Build 250
- Status
- BLAST - all 9 species run, dumped and loaded
- Compara - finished and loaded
- RNASeq - in progress
- General problems/fire-fighting
- Transcript builder (mysteriously did not complete for two genomes)
- InterProScan (found a bug in the Ensembl code, fixed and provided patch)
- BLAST dumping (exceeded MySQL max connections. Needed to tweak server conf)
ParaSite
- Redesign of homepage
- Beta version demonstrated. Will circulate some other possible layouts
- New blog for ParaSite
- wbparasite.wordpress.com
Data integration / curation
- Brugia
- Produced initial annotation of new assembly.
- MP presented to Brugia analysis group in conference call
- C. elegans
- DB_remark clean up -> Brief_identification, pseudogenes (will go in as a note in ENA submissions)
- Classification of pseudogenes. Will need a new unitary pseudogene tag in ?Pseudogene model
- Consolidation of UniProt product names into the database
- SVM paper => gene curation
- Proteomics data from Lamond lab
August 6, 2015
In attendance: KH, PD, MP, TD, BB
Build 250
- In progress. MP doing all 9 core species (no precedent for this).
- First build from github code base (no problems so far)
- How long will BLAST for all 9 species take? (All BLASTX and BLASTP will need to be re-done, since all of the worm proteomes have changed.)
- Problems
- RNAis with bad DNA_text. Cleaned and reported to Caltech
- NCBI taxonomy database at EBI changed location, config needed a patch
Data integration / curation
- B. malayi
- WS248 Brugia submission now live at ENA
- Gene prediction on new assembly
- Looking at RATT transfers from old assembly
- Using HAL mapping to project genes also works quite well.
- AUGUSTUS to fill in the gaps
- Handover will be Monday 10th August (during Brugia conference call)
- C. remanei
- Bad data. Submitter does not have time right now to clean. Will be deferred from this build
- General
- Hawaiian strain - will go in as T3. BLASTable.
- Cleaning up T2. Getting rid of in-frame STOPs for P. pacificus
- Cleaning up Brief_identification for protein-coding and pseudogenes.
Database migration
- New interface for uploading ace files.
- Discussed potential strategies for schema updates
ParaSite
- New release out
- BLAST
- Fixed glitches in new service
- Discussed potential minor improvements to front page
- Analytics
- Redesign of the home page. Make it more WormBase compliant.
- Search box confusion