Difference between revisions of "Hinxton 2015.07- Meeting minutes"
From WormBaseWiki
Revision as of 10:22, 10 July 2015
July 2015
July 9, 2015
In attendance: KH, MP, BB, JL, PD, GW
Minuted by PD
Start 16:00
Thomas's Departure
- Last day: 18th Sept.
- Documentation, documentation, documentation.
  - Ideal: anyone can take an ACeDB database and load it into a Datomic instance
  - Documentation is in the database GitHub repository
    - Not everyone has access at the moment.
    - There may be elements that Thomas assumes are so trivial that they have not been documented
- Action items:
  - WS250: Gary to take a WS250 ACeDB database and try to work solely from the documentation
    - For any issues, use Thomas's expertise while he is still here and improve the documentation
  - WS249 could be used as a test case even though a Datomic database already exists?
- Model updates are done via an augmented models file that is used for the conversion.
  - Simple model additions = simple extension to the augmented file
  - Complex changes/reworking of classes = augmented file needs munging; need to look at this process
- Gary has been through some of the steps necessary to set up a Datomic database, but this was on his desktop machine, which brought architecture issues, so he defaulted to an instance that Thomas had set up.
  - A better test would be to document and try setting up Datomic on a fresh AWS instance: can it be done from the documentation?
- Tools development: Colonnade and the ACeDB-like tree editor
  - Written in ClojureScript (Clojure compiled into JS)
Job Advert
- Originally a database developer, but now possibly looking for some web dev skills
  - Clojure programming
ParaSite
Working on
- bigwig track display for 3 cestodes
  - Tracks look OK, but working with various ideas for display
  - Grouped by study
  - Dynamic normalisation causes some display issues, as it currently normalises over the loaded area and not the complete genome/chromosome/sequence.
    - Webcode doesn't appear to have a mechanism to sort out the scaling
  - Possible to compare tracks, but the scale will invariably be different.
  - Need to get this correct so as not to smear out points of biological interest.
    - How do other browsers perform?
      - JBrowse appears to do the same as the e! browser
      - Look at UCSC track hub handling
  - Possibly define a set of ubiquitous genes to normalise on?
  - Gary uses ana-1 for gleaning expression and RNASeq statistics as it appears to always be expressed at a consistent level, but having a set of housekeeping genes might be a start.
  - ENCODE might have software for normalisation in rseq tools
    - Are we doing things wrong/differently? No, as e! does the same, so not a major priority, but worth exploring.
- Imminent release of ParaSite 3 (~30th July)
  - Finalising
    - BLAST dbs
    - REST
    - healthchecks
  - Other
    - GO_term population: dropped by EG as they have UniProt assistance; we do not, so have to run this pipeline.
    - Ready for search dumps, if no critical healthcheck failures
    - GFF dumps all in place
    - Need to do testing
      - 2 new species and 1 changed
    - MartBuild takes ~1 week
- Better forum for announcing new features
  - News feed page/blog
  - What's new from old releases logged and possibly turned into paragraphs for release blog?
  - Possibly re-work the front page to give more prominence to news/information on what's new in the release.
- Links back to WormBase Central are in place for release 3.
- Look at Google search optimisation and creating a more user-friendly top entry in the Google results, as the current Google choices are a bit odd.
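The dynamic normalisation issue noted under the bigwig track item above can be illustrated with a small sketch. This is not the browser's code: the function names and the toy coverage values are made up for illustration, and it assumes the simplest case of scaling each track against a maximum value. It only shows why scaling against the loaded region gives a different picture from scaling against the whole sequence.

```python
from typing import List, Sequence


def scale_to_max(values: Sequence[float], reference_max: float) -> List[float]:
    """Scale coverage values into the range 0..1 against a reference maximum."""
    if reference_max <= 0:
        return [0.0 for _ in values]
    return [v / reference_max for v in values]


def normalise_window(coverage: Sequence[float], start: int, end: int) -> List[float]:
    """Dynamic scaling (hypothetical): use the maximum of the loaded region only."""
    window = coverage[start:end]
    return scale_to_max(window, max(window, default=0.0))


def normalise_global(coverage: Sequence[float], start: int, end: int) -> List[float]:
    """Fixed scaling (hypothetical): use the maximum over the whole sequence."""
    return scale_to_max(coverage[start:end], max(coverage, default=0.0))


# Toy per-bin coverage with one strong peak outside the loaded region.
coverage = [1.0, 2.0, 1.0, 2.0, 100.0, 2.0]

print(normalise_window(coverage, 0, 4))  # [0.5, 1.0, 0.5, 1.0]
print(normalise_global(coverage, 0, 4))  # [0.01, 0.02, 0.01, 0.02]
```

With window scaling the same region looks strongly expressed, while against the whole sequence it is close to background, which is why two tracks (or two views of one track) are hard to compare by eye.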
WormBase Central
- Gary
  - STAR RNASeq alignments for the Brugia v4 assembly
  - Yet another C. elegans reference sequence error paper.
    - Some examples look odd: deletion in intron where others have reported only a SNP.
    - Some inserts fix genes.
  - Working on test code for generating modENCODE expression graphs with the scientific scrutiny of J.A.
  - Once the science side is happy, Gary will work with Sibyl to try and utilise the code for web usage.
- Michael
  - Ann Hart genes conserved across Caenorhabditis, checking for motifs
  - New Brugia assembly
    - Chromosome naming and identification (chromosomes scaffolded inc. X/Y + unlocalised contigs) (20-30 sequences where old assembly was >2000)
    - Training Augustus on old models (min. 2 exons with good intergenic spacing).
    - This is the FINAL version of the genome
      - Reasonably finished state
      - Haplotype sequences in additional file
      - No bacterial contamination as Avril's code has been used.
      - 18% duplication in CEGMA genes (old assembly <10%, TIGR >20%)
    - Gene set will be projected from old assembly + gap filling and extension from Augustus.
      - Issues will be:
        - Partially mapped manually curated genes
        - Curated and NOT mapped (use Exonerate to recover).
        - Substring models: already has code for flattening them down.
- Paul
  - Working on retirement of the old Sanger CVS for pipeline code.
    - Have always had a Git mirror; this will become the primary source
    - Need to consider the models and wspec in general.
    - C. remanei PX356
      - Generated setup configs etc.
      - Have an e! database containing genome
      - Annotations provided by user assessed and found to be bad
        - CDS features found with more exons than CDS features under a parent mRNA.
    - Trying to meet with the new UniProt C. elegans curator
    - Finished looking at IWM genes
- Discussion about all the places we store work tickets
  - RT-sanger worm-bug@sanger.ac.uk: >1000 paper tickets
  - https://bitbucket.org/pauld/seqcur_ticketing-system: 84 tickets
  - GitHub
    - github website - website issues and helpdesk
    - github pipeline - our code
    - https://github.com/Paul-Davis/seqcur_ticketing-system/issues: 3 issues
  - JIRA - Ensembl Genomes
- Need to limit to as few as possible
  - What can be lost
    - Paul will work through the bitbucket ones (mostly tier II)
    - JIRA: Kev is the main user of this system
- Need to discuss with the projects as a whole!
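The C. remanei PX356 annotation problem noted under Paul's items above could be screened for with something like the sketch below. It is an assumption about what the check might look like, not the check that was actually run: it reads the note as "a transcript has more CDS segments than exons", assumes GFF3 input with `Parent=` attributes, and the function names and example lines are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def parents_of(attributes: str) -> List[str]:
    """Return the Parent IDs from a GFF3 attribute column (empty if none)."""
    for field in attributes.split(";"):
        key, _, value = field.strip().partition("=")
        if key == "Parent":
            return value.split(",")
    return []


def transcripts_with_excess_cds(gff_lines: Iterable[str]) -> List[str]:
    """List parent IDs that have more CDS lines than exon lines (assumed check)."""
    exons: Counter = Counter()
    cds: Counter = Counter()
    for line in gff_lines:
        # Skip blank lines, comments and directives.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        feature_type, attributes = cols[2], cols[8]
        if feature_type == "exon":
            exons.update(parents_of(attributes))
        elif feature_type == "CDS":
            cds.update(parents_of(attributes))
    # A Counter returns 0 for parents that have no exon lines at all.
    return sorted(t for t, n in cds.items() if n > exons[t])


# Made-up example: tx1 has two CDS segments but only one exon.
example = [
    "##gff-version 3",
    "I\tuser\tmRNA\t1\t900\t.\t+\t.\tID=tx1;Parent=gene1",
    "I\tuser\texon\t1\t400\t.\t+\t.\tParent=tx1",
    "I\tuser\tCDS\t1\t200\t.\t+\t0\tParent=tx1",
    "I\tuser\tCDS\t300\t400\t.\t+\t1\tParent=tx1",
    "I\tuser\texon\t1\t400\t.\t+\t.\tParent=tx2",
    "I\tuser\tCDS\t1\t400\t.\t+\t0\tParent=tx2",
]

print(transcripts_with_excess_cds(example))  # ['tx1']
```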