WormBase-Caltech Weekly Calls March 2016

March 3, 2016

Sentence-level indexing

  • Chris is experimenting with flagging sentences as positive/negative for phenotype and flagging keywords as true or false positives
  • It may be helpful to treat sentences as objects and generate unambiguous, unique IDs for them
  • There will likely be difficulty in unambiguously pulling out sentences from re-processed papers
  • Best identifier would be the sentence itself
  • Sentence 'objects' could be connected to annotations in WB
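
One way to read "the identifier is the sentence itself" is to derive the ID from the normalized sentence text, so that the same sentence gets the same ID even when a paper is re-processed. The following is only a minimal sketch under that assumption; the `WBSentence` prefix, the normalization rules, and the function names are hypothetical and not an existing WormBase convention.

```python
import hashlib
import re
import unicodedata


def normalize_sentence(text: str) -> str:
    """Collapse formatting differences that re-processing a paper may introduce."""
    text = unicodedata.normalize("NFKC", text)   # unify Unicode variants
    text = re.sub(r"\s+", " ", text).strip()     # collapse runs of whitespace
    return text.lower()


def sentence_id(text: str, prefix: str = "WBSentence") -> str:
    """Return a deterministic ID derived only from the normalized sentence text.

    The prefix is a hypothetical placeholder, not a real WormBase ID class.
    """
    digest = hashlib.sha1(normalize_sentence(text).encode("utf-8")).hexdigest()
    return f"{prefix}:{digest[:16]}"


a = sentence_id("Mutants of daf-2 are  long-lived.")
b = sentence_id("mutants of daf-2 are long-lived. ")
print(a == b)  # the two strings normalize to the same text, so the IDs match
```

Annotations could then point at such an ID, while the full sentence text is stored once alongside it. Sentences whose wording changes between extractions would still get different IDs, so this does not remove the extraction difficulty noted above.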

Proteomic datasets

  • Wen wants to systematically collect proteomic datasets
  • Many are scattered across different resources
  • Would also help ParaSite to index similar datasets for parasites
  • MeSH terms may help identify proteomic papers

MOD-human disease collaboration

  • FlyBase working on a shared template for curating human disease
  • SGD, ZFIN, MGI, FlyBase, and WormBase meeting at Caltech to discuss
  • One priority is minimizing redundancy of effort
  • There is a common portal for human diseases in progress
  • We should consider the Monarch Initiative approach

Addgene & WB constructs

  • Karen is working with Addgene to pull in worm reagents (plasmids/constructs)
  • Addgene wants better linking to WormBase

Metabolomic data

  • Can we show chemical structures for metabolites?
  • User (M. Witting) says metabolism researchers will want monoisotopic masses. These masses are not currently available through ChEBI, and it will be a while before ChEBI provides them. Will find a formula and source files so that Juancarlos can implement a calculator in the OA.
  • We already use, sync and collaborate with ChEBI
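
A monoisotopic mass is the sum of the masses of the most abundant isotope of each atom in the molecular formula. The sketch below illustrates that calculation only; it is not the formula or source files mentioned above, the function name is hypothetical, and the small mass table (values in unified atomic mass units) should be checked against an authoritative isotope table before any real use. It assumes simple formulas such as `C6H12O6`, with no parentheses, charges, or isotope labels.

```python
import re

# Masses of the most abundant isotope for a few common elements (u).
# Illustrative subset only; verify against an authoritative table.
MONOISOTOPIC_MASS = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
    "S": 31.97207100,
}

_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a flat formula like 'C6H12O6'."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    total = 0.0
    for symbol, count in _TOKEN.findall(formula):
        if symbol not in MONOISOTOPIC_MASS:
            raise ValueError(f"no mass tabulated for element {symbol!r}")
        # A missing count means a single atom of that element.
        total += MONOISOTOPIC_MASS[symbol] * (int(count) if count else 1)
    return total


print(round(monoisotopic_mass("C6H12O6"), 4))  # glucose: 180.0634
```

Rejecting unknown elements and unsupported syntax with an error, rather than silently skipping them, keeps a wrong mass from being stored for a metabolite.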

March 10, 2016

Adding WormBase ParaSite Papers to the Curation Pipeline (Kimberly)

  • Developing new paper editor for Jane and Michael P. to approve parasite papers
  • Adding species information to ALL papers; can be one or more species per paper
  • Would like to then incorporate parasite papers into downstream SVM flagging pipeline and curation status form
  • Species information could be used to keep curation stats clear
  • We should discuss the curation and tracking of non-elegans species on the next site-wide call
  • Might be best to have a separate ParaSite-specific curation status form; depends on use-case
  • We may be able to just add new columns to the curation status form, Caltech vs EBI
  • Michael and Jane want to curate gene function, e.g. GO; not sure what else

DynamoDB backend for Datomic

  • The REST API has now switched to a DynamoDB backend.
  • Juancarlos could set up a form to try Datomic queries

WB FAQ page

  • Need to request write access from Todd
  • We should find out whether this page is used much. It needs updating considerably.


March 17, 2016

Xiaodong taking a year's leave

  • Will try working with a company in China for a year
  • Xiaodong will start April 15
  • Daniela taking over antibody curation, Chris & Daniela taking over gene regulation and sequence feature
  • Xiaodong will discuss antibodies for disease with Ranjana

Tissue enrichment tool (Raymond)

  • Should development continue with integration into WormBase? Yes, no one else is doing it
  • GO enrichment tool in Intermine?

Text mining: brainstorming

  • We can likely explore many more options in terms of text mining
  • All possibly recognizable entities could have data filled and displayed to curators and/or end users
  • What about showing PDF markup from papers in PubMed Central? There is increasing interest in participation
  • We're trying sentence-level SVM for phenotype; will see how that works
  • We can quote papers verbatim in annotations, but should probably use double quotes from now on
  • We'll try to generate putative annotations automatically that can be confirmed or rejected by a curator or author


March 24, 2016

Outsourcing compute time

  • SPELL requires 30 hours with 16GB memory
  • Amazon cloud could offer 244GB memory at $3 per hour
  • We already have an Amazon account
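
As a rough worked example of the figures above: if a SPELL run still took about 30 hours on the larger instance (an assumption; more memory may change the run time), one run at the quoted $3 per hour would cost about $90. The names below are illustrative only.

```python
SPELL_HOURS = 30           # run time reported in the notes
PRICE_PER_HOUR_USD = 3.0   # quoted price for the 244GB instance


def estimated_cost(hours: float, price_per_hour: float) -> float:
    """Cost of one run, assuming billing is simply hours times hourly price."""
    return hours * price_per_hour


print(estimated_cost(SPELL_HOURS, PRICE_PER_HOUR_USD))  # 90.0
```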


March 31, 2016

Reagent data submission form (from Norbert Perrimon)

  • Data table provided as an example submission of reagent information at point of publication
  • Several reagent types included
  • We may want more specific information, e.g. sequence(s), relevant gene(s)
  • Would we need/want other follow-up tables to fill in more specific information?
  • Genetic variations should be added, including sequence
  • WB curators should draw up a table of minimal information for their data type
  • Chris will clean up and send around table for RNAi data submission

Proteomic papers (Wen)

  • Searching title for "proteome"
  • ~230 papers total
  • ~187 proteomic papers might be curatable
  • Many may not have expression clusters, but Wen did find new datasets
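
The title search described above amounts to a case-insensitive substring match. A minimal sketch follows, assuming papers are available as records with a `title` field; the record layout, function name, and sample titles are made-up placeholders, not real WormBase data.

```python
def filter_by_title(papers, keyword="proteome"):
    """Return the papers whose title contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [p for p in papers if needle in p["title"].casefold()]


# Placeholder records for illustration only.
papers = [
    {"id": "paper-1", "title": "The proteome of an example tissue"},
    {"id": "paper-2", "title": "A genetic screen for example mutants"},
    {"id": "paper-3", "title": "Quantitative PROTEOME profiling, an example"},
]

hits = filter_by_title(papers)
print([p["id"] for p in hits])  # ['paper-1', 'paper-3']
```

Note that a search for "proteome" will not match titles that only say "proteomic" or "proteomics"; a shorter stem such as "proteom" would catch those as well.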