WormBase-Caltech Weekly Calls March 2016
From WormBaseWiki
March 3, 2016
Sentence-level indexing
- Chris is experimenting with flagging sentences as positive/negative for phenotype and flagging keywords as true or false positives
- It may be helpful to objectify sentences and generate unambiguous, unique IDs for sentences
- There will likely be difficulty in unambiguously pulling out sentences from re-processed papers
- Best identifier would be the sentence itself
- Sentence 'objects' could be connected to annotations in WB
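One way to make "the sentence itself" the identifier is to hash a normalized form of its text, so the same sentence gets the same ID even if it is pulled from a re-processed paper with different whitespace. The following is only a sketch: the `WBSentence` prefix, the normalization rules, and the 16-character truncation are illustrative assumptions, not an agreed WormBase convention.

```python
import hashlib
import re
import unicodedata


def normalize_sentence(sentence: str) -> str:
    """Normalize the Unicode form, collapse whitespace, and lowercase the text."""
    text = unicodedata.normalize("NFKC", sentence)
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower()


def sentence_id(sentence: str, prefix: str = "WBSentence") -> str:
    """Return a deterministic ID derived only from the normalized sentence text.

    The prefix is hypothetical; it is not an existing WormBase object class.
    """
    digest = hashlib.sha1(normalize_sentence(sentence).encode("utf-8")).hexdigest()
    return f"{prefix}:{digest[:16]}"


if __name__ == "__main__":
    # Same sentence, different spacing and case: both calls print the same ID.
    print(sentence_id("Mutants show  a  Dpy phenotype."))
    print(sentence_id("mutants show a Dpy phenotype."))
```

Hashing only the text means that identical sentences from different papers collapse to one ID; if that is not wanted, the paper identifier could be mixed into the hash input.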
Proteomic datasets
- Wen wants to systematically collect proteomic datasets
- Many are scattered across different resources
- Would also help ParaSite to index similar datasets for parasites
- MeSH terms may help identify proteomic papers
MOD-human disease collaboration
- FlyBase working on a shared template for curating human disease
- SGD, ZFIN, MGI, FlyBase, and WormBase meeting at Caltech to discuss
- One priority is minimizing redundancy of effort
- There is a common portal for human diseases in progress
- Should consider Monarch initiative approach
Addgene & WB constructs
- Karen working with Addgene to pull in worm reagents (plasmids/constructs)
- Addgene does want better linking to WormBase
Metabolomic data
- Can we show chemical structures for metabolites?
- A user (M. Witting) says metabolism researchers will want monoisotopic masses. These are not currently available through ChEBI, and it will be a while before ChEBI can provide them. Will find a formula and source files so Juancarlos can implement a calculator in the OA.
- We already use, sync and collaborate with ChEBI
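As a rough illustration of what a monoisotopic mass calculator could look like, the sketch below sums the mass of the most abundant isotope of each element in a simple molecular formula. It is not the OA implementation: the function name, the small element table, and the restriction to flat formulas (no parentheses, charges, or isotope labels) are assumptions made for the example, and the mass values should be checked against an authoritative isotope table before any real use.

```python
import re

# Mass (in unified atomic mass units) of the most abundant isotope of each
# element. Only a handful of elements are listed; this table is illustrative.
MONOISOTOPIC_MASS = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.9949146196,
    "P": 30.97376163,
    "S": 31.97207100,
}

# An element symbol (capital letter plus optional lowercase letter) and an
# optional count, e.g. "C6" or "O".
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def monoisotopic_mass(formula: str) -> float:
    """Sum monoisotopic masses for a flat formula such as 'C6H12O6'."""
    if not formula:
        raise ValueError("empty formula")
    total = 0.0
    position = 0
    for match in TOKEN.finditer(formula):
        # Reject anything the tokenizer skipped over (parentheses, charges, ...).
        if match.start() != position:
            raise ValueError(f"cannot parse formula: {formula!r}")
        position = match.end()
        symbol, count = match.group(1), match.group(2)
        if symbol not in MONOISOTOPIC_MASS:
            raise ValueError(f"no mass listed for element: {symbol}")
        total += MONOISOTOPIC_MASS[symbol] * (int(count) if count else 1)
    if position != len(formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    return total


if __name__ == "__main__":
    print(round(monoisotopic_mass("C6H12O6"), 4))  # glucose
```

Using the most abundant isotope of each element, rather than the average atomic weight, is what distinguishes a monoisotopic mass from a molecular weight.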
March 10, 2016
Adding WormBase ParaSite Papers to the Curation Pipeline (Kimberly)
- Developing a new paper editor for Jane and Michael P. to approve parasite papers
- Adding species information to ALL papers; can be one or more species per paper
- Would like to then incorporate parasite papers into downstream SVM flagging pipeline and curation status form
- Species information could be used to keep curation stats clear
- We should discuss the curation and tracking of non-elegans species on the next site-wide call
- Might be best to have a separate ParaSite-specific curation status form; depends on use-case
- We may be able to just add new columns to the curation status form, Caltech vs EBI
- Michael and Jane want to curate gene function, e.g. GO; not sure what else
DynamoDB backend for Datomic
- REST API now switched to the DynamoDB backend.
- Juancarlos could set up a form to try Datomic queries
WB FAQ page
- Need to request write access from Todd
- We should find out whether this page is used much. It needs updating considerably.
March 17, 2016
Xiaodong taking a year's leave
- Will try working with a company in China for a year
- Xiaodong will start April 15
- Daniela taking over antibody curation, Chris & Daniela taking over gene regulation and sequence feature
- Xiaodong will discuss antibodies for disease with Ranjana
Tissue enrichment tool (Raymond)
- Should development continue with integration into WormBase? Yes, no one else is doing it
- GO enrichment tool in Intermine?
Text mining: brainstorming
- We can likely explore many more options in terms of text mining
- All possibly recognizable entities could have data filled and displayed to curators and/or end users
- What about showing PDF markup from papers in PubMed Central? There is increasing interest in participation
- We're trying sentence-level SVM for phenotype; will see how that works
- We can quote papers verbatim in annotations, but should probably use double quotes from now on
- We'll try to generate putative annotations automatically that can be confirmed or rejected by a curator or author
March 24, 2016
Outsourcing compute time
- SPELL requires 30 hours with 16GB memory
- Amazon cloud could offer 244GB memory at $3 per hour
- We already have an Amazon account
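A back-of-the-envelope cost for one SPELL run can be worked out from the figures above. This assumes the job would still take about 30 hours on the cloud instance and that billing is a flat $3 per hour; storage and data transfer charges are ignored, and the function name is made up for the example.

```python
def estimated_cost(runtime_hours: float, rate_per_hour: float) -> float:
    """Flat hourly estimate: runtime multiplied by the hourly rate."""
    return runtime_hours * rate_per_hour


# Figures from the minutes: 30 hours of compute at $3 per hour.
spell_cost = estimated_cost(30, 3.0)
print(f"${spell_cost:.2f}")  # prints $90.00
```

If the larger memory shortens the runtime, the cost would drop proportionally under the same assumptions.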
March 31, 2016
Reagent data submission form (from Norbert Perrimon)
- Data table provided as an example submission of reagent information at point of publication
- Several reagent types included
- We may want more specific information, e.g. sequence(s), relevant gene(s)
- Would we need/want other follow-up tables to fill in more specific information?
- Genetic variations should be added, including sequence
- WB curators should draw up a table of minimal information for their data type
- Chris will clean up and send around table for RNAi data submission
Proteomic papers (Wen)
- Searching title for "proteome"
- ~230 papers total
- ~187 proteomic papers might be curatable
- Many may not have expression clusters, but Wen did find new datasets
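The title search could be sketched as a case-insensitive match on paper titles. This is not Wen's actual query: the function name, paper IDs, and titles below are invented for illustration, and matching the stem "proteom" (so that "proteomic" and "proteomics" are also caught) is an assumption beyond the literal "proteome" search described above.

```python
import re

# Stem match so "proteome", "proteomic", and "proteomics" all hit.
PROTEOME_PATTERN = re.compile(r"proteom", re.IGNORECASE)


def find_proteomic_titles(papers: dict) -> list:
    """Return the IDs of papers whose title mentions the proteome stem."""
    return [
        paper_id
        for paper_id, title in papers.items()
        if PROTEOME_PATTERN.search(title)
    ]


if __name__ == "__main__":
    # Hypothetical IDs and titles, not real WormBase records.
    example_papers = {
        "paper1": "Quantitative proteomics of aging worms",
        "paper2": "A genetic screen for dauer formation mutants",
        "paper3": "The Proteome of the C. elegans embryo",
    }
    print(find_proteomic_titles(example_papers))
```

A title match only produces candidates; each hit would still need a curator to decide whether the paper is curatable.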