Difference between revisions of "Hinxton 2015.07- Meeting minutes"

From WormBaseWiki

Revision as of 10:40, 16 July 2015

July 2015

July 16, 2015

In attendance: KH, MP, BB, JL, PD, GW, TD
Minuted by PD
Start 16:00

July 9, 2015

In attendance: KH, MP, BB, JL, PD, GW
Minuted by PD
Start 16:00

Database migration

  • Thomas last day 18th Sept.
  • Documentation Documentation Documentation.....
    • Ideal: Anyone can take an ACeDB database and load it into a Datomic instance
    • Documentation is on the database github repository
    • Not everyone has access to the db repository currently.
    • There may be parts of the docs that need to be more prescriptive
  • Datomic set-up
    • Gary had trouble setting up Datomic on his desktop machine
    • A better test would be to try setting up Datomic on a fresh AWS instance. Can it be done from the documentation alone?
  • Datomic build
    • WS250: GW to take a WS250 ACeDB database and try to work solely from the documentation
    • If there are any issues, use Thomas's expertise while he is still here and improve the documentation
    • WS249 could be used as a test case even though a Datomic database already exists (rather than waiting for WS250)
  • Models updates
    • Done via an augmented models file to make the conversion.
    • Simple model additions = Simple extension to the augmented file
    • Complex changes/reworking of classes might not be so straightforward. Need to be aware of the process for doing this
  • Tools development: Colonnade and the ACeDB-like tree editor TrACeView
  • Thomas's replacement
    • Originally a database developer, but now poss looking for some web dev skills
    • Target a Clojure programmer?

Curation / Annotation

  • New Brugia assembly (MP)
    • state of the genome
      • "Final" version
      • "No bacterial contamination" as Avril's code has been used.
      • Chromosome naming and identification (chromosomes scaffolded inc. X/Y + unlocalised contigs; 20-30 sequences where the old assembly had >2000)
      • Haplotype sequences in additional file
    • Annotation
      • 16% (1.19/gene) duplication in CEGMA genes; old v3 assembly 7% (1.08/gene); TIGR >20%
      • Gene set will be projected from the old assembly with RATT, plus gap filling and extension from Augustus.
      • Training Augustus on old models (min. 2 exons with good intergenic spacing).
      • Issues will be:
        • Partially mapped manually curated genes
        • Curated and NOT mapped (Use Exonerate to recover).
        • Substring models: code for flattening them down already exists.
    • STAR RNASeq alignments (GW)
  • New C. remanei genome (PD)
    • Strain PX356, from Phillips lab (published)
      • Generated setup configs etc.
      • Have an e! database containing genome
      • Annotations provided by user assessed and found to be bad
        • CDS features found with more exons than CDS features under a parent mRNA.
  • C. elegans reference genome
    • Yet another C. elegans reference sequence error paper (Zhang).
      • Some examples look odd, e.g. a deletion in an intron where others have reported only a SNP.
      • Some inserts fix genes.
  • Re-working modENCODE RNASeq data
    • Working on test code for generating modENCODE expression graphs, under the scientific scrutiny of Julie Ahringer (GW)
    • Will work with Sibyl to adapt the code for web usage once the science has been agreed.
  • Gene curation
    • Continues as usual
    • Finished looking at IWM genes (PD)
    • Organise meeting with new UniProt C. elegans curator
    • Brief discussion on capturing date-last-reviewed (to be followed up at next meeting)
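
The minutes do not record how the user-provided C. remanei annotations were assessed. As a minimal sketch of one possible check, assuming the annotations are in GFF3 with `exon` and `CDS` features linked to transcripts through the `Parent` attribute, the following counts both feature types per parent and flags parents that carry more CDS segments than exons. The function names and the flagging rule are illustrative assumptions, not the pipeline actually used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def count_children(gff_lines: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Count exon and CDS features per Parent ID in GFF3 lines."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: {"exon": 0, "CDS": 0})
    for line in gff_lines:
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in ("exon", "CDS"):
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        # A feature may list several comma-separated parents.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[parent][cols[2]] += 1
    return dict(counts)


def suspicious_parents(counts: Dict[str, Dict[str, int]]) -> List[str]:
    """Return parents with more CDS segments than exons (illustrative rule)."""
    return sorted(p for p, c in counts.items() if c["CDS"] > c["exon"])
```

Each CDS segment normally lies within an exon of the same transcript, so a parent with more CDS segments than exons is worth inspecting by hand.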


  • RNASeq data display (JL & BB)
    • bigwig track display for RNASeq data for 3 cestodes
    • Tracks look OK, but working on various ideas for display
    • Grouped by study
    • Track scaling
      • Dynamic normalisation causes some display issues as currently normalises over loaded area and not complete genome/chromosome/sequence.
      • Possible to compare tracks but the scale will invariably be different.
      • Need to get scaling right so as to not smear out points of biological interest.
      • How do other browsers perform?
        • JBrowse appears to do the same as the e! browser
        • Look at UCSC track hub handling
    • Possibly define a set of ubiquitous genes to normalise on?
      • Gary uses ama-1 for gleaning expression and RNASeq statistics as it appears to always be expressed at consistent level, but having a set of housekeeping genes might be a start.
      • ENCODE might have software for normalisation in rseq tools
    • Are we doing things wrong/differently? No, as e! does the same, so not a major priority, but worth exploring.
  • Imminent release of ParaSite 3 (~30th July) (BB + KH)
    • Done
      • FTP dumps
      • blast dbs
      • REST API
    • To do
      • MartBuild takes ~1 week
      • Clearing critical healthchecks
      • Search dumps
      • Testing!
      • 2 new species and 1 changed
    • Misc
      • Forum for announcing new features
        • News feed page/blog
        • What's new from old releases logged and poss turned into paragraphs for release blog?
        • Possibly re-work the front page to give more prominence to news/information on what's new in the release.
      • GO_term population: dropped by EG as they have UniProt assistance; we may not run this pipeline in future releases.
      • Look at Google search optimisation and creating a more user-friendly top entry in the Google results, as the current Google choices are a bit odd.
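
The track-scaling issue raised under RNASeq data display can be illustrated with a small sketch. It contrasts scaling a view by the maximum of the loaded window, by the maximum of the whole sequence, and by the mean coverage over a reference region such as a consistently expressed gene. All function names are hypothetical and coverage is modelled as a plain list of per-base values; this is not how the e! browser or JBrowse implement scaling.

```python
from typing import List, Sequence, Tuple


def scale_to_window(coverage: Sequence[float], start: int, end: int) -> List[float]:
    """Scale the visible window by its own maximum (per-view scaling)."""
    window = coverage[start:end]
    peak = max(window, default=0.0)
    return [v / peak if peak else 0.0 for v in window]


def scale_to_sequence(coverage: Sequence[float], start: int, end: int) -> List[float]:
    """Scale the visible window by the maximum over the whole sequence."""
    peak = max(coverage, default=0.0)
    return [v / peak if peak else 0.0 for v in coverage[start:end]]


def mean_coverage(coverage: Sequence[float], regions: Sequence[Tuple[int, int]]) -> float:
    """Mean coverage over reference regions given as (start, end) pairs."""
    values = [v for s, e in regions for v in coverage[s:e]]
    if not values:
        raise ValueError("reference regions are empty")
    return sum(values) / len(values)


def scale_to_reference(coverage: Sequence[float], start: int, end: int,
                       reference_level: float) -> List[float]:
    """Scale the visible window by a fixed reference expression level."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return [v / reference_level for v in coverage[start:end]]


# Toy track: a weakly expressed region followed by a strongly expressed one.
track = [1, 2, 4, 2, 1, 50, 100, 50, 1, 1]
local_view = scale_to_window(track, 0, 5)     # small peak fills the full height
global_view = scale_to_sequence(track, 0, 5)  # same peak is nearly flat
```

With per-window scaling the same region changes height as the user pans or zooms, which makes tracks hard to compare. Whole-sequence scaling is stable but can flatten weakly expressed regions. A fixed reference level keeps the scale comparable across tracks without depending on the loaded area.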


  • Working on retirement of the old Sanger CVS for pipeline code.
    • Have always had a Git mirror; this will become the primary source
    • Need to consider the models and wspec in general (tagging etc)
    • Need to communicate this with Caltech (new place to pick up models)
  • Discussion about all the places we store work tickets


  • Work for Ann Hart: checking for motifs conserved across Caenorhabditis with PCactus (MP)