Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki

Revision as of 17:06, 13 August 2015

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings








July 2, 2015

Discussion Topics from IWM

  • Explaining job posting options via the forum in new Worm Breeder's Gazette article
  • Display of CRISPR data
    • Alleles with multiple lesions (one name, many mutations), need to be curated and mapped
  • Ontology term enrichment analysis, using ontologies other than gene ontology
    • Discussed on GO call yesterday; we can/should follow up with Paul Thomas
    • Would be good to have a single central tool/resource for enrichment analysis
    • PantherDB vs DAVID

WormBook Chapters

  • Paul S will review over next week and will provide feedback


  • Sending out e-mails to all labs/PIs reminding about new data forms
  • Could also do more personalized outreach to a smaller subset of PIs/labs
    • Could focus on PIs not at the IWM


  • Embryonic development, cell division timing
    • Sulston timing
    • Waterston datasets
    • Zhao cell lineage timing datasets
    • Bao lab?
  • New EM reconstructions from David Hall, Scott Emmons, etc.
  • Neuronal connectivity, collaborative database with Scott Emmons and colleagues

Citace upload

  • Curators submit data to Wen on Tuesday, July 28th

Taking over Gene Orienteer

  • Xiaodong and Sibyl are working on it

RNASeq data

  • Gary Williams only using high quality data, taking care of all curation (including meta data)
  • Public archive of rejected datasets?


  • Bill Mohler et al. working on desktop application

July 9, 2015

Expression Pattern

  • Certain/uncertain qualifiers not annotated before some date
  • ~3,000 ?Expr_pattern objects without that annotation/tag
  • Daniela will work on bringing these up to date; hopefully it won't take long

Expression Clusters to Anatomy & Life Stage annotations

  • Many large scale datasets with tissue-specific expression data
  • Much of what is in SPELL is not annotated to ?Anatomy or ?Life_stage terms
  • A goal: make expression data queryable via ?Anatomy terms/pages
  • Wen will make the model change proposal
  • We may not want to show explicitly in widget
  • There is a need for a condensed display of expression data (per gene)
  • Some datasets, like the EPIC data, explicitly mention each embryonic cell name
  • Need for a condensed ontology browser per gene/anatomy and gene/life stage

Proteomic analysis

  • Encyclopedia of Proteomic Dynamics, contacted Wen to share data
  • Wen will meet/discuss with group soon to determine what the goals are
  • It isn't clear what format the data are in
  • Should include Gary Williams on discussions as he already processes Mass Spectrometry data

External Databases

  • To what extent can we take care of the data and display of other lab/publication databases?
  • Many authors want to share and make links to their database/website via WormBase
  • What is the best way to handle large scale dataset sharing requests that don't necessarily (for the time being) fit our data model?
  • We can take advantage of the "External Links" display on WBPaper pages to link out to the external databases affiliated with the paper, maybe including a link to our FTP site with shared data files
  • At least a stop gap measure until we can properly model the data

Cis-regulatory site nomenclature?

  • Barbara Meyer's lab published many "rex" (Recruitment Elements on X) sites, numbered sequentially
  • Tim Schedl wondering about others' thoughts/opinions on how to, possibly, standardize the names of cis-regulatory elements
  • Could be like gene names, without dash, e.g. "rex1", "rex2"
  • We may want to try a "WBsf-" prefix on all element names, e.g. "WBsf-rex1", although it may only be used in-house


  • Were there any conclusions about phenotype lookup from the Allele-Phenotype form?
  • Chris spoke with Harald Hutter and others at the meeting about how to improve the lookup for phenotypes
  • Would be good to provide an explicit option to see phenotypes of related (or allele-affiliated) genes, perhaps by shared GO-term annotation
  • Need to think more on how to best compress display of phenotypes on gene pages as well
  • We do already provide links to the Variation and Gene pages (with Phenotypes displayed) in the term information box of the form

July 16, 2015

Anatomy term page expression

  • Raymond and Juancarlos are working on a display of genes that may be exclusively expressed in that anatomy object

Construct/Transgene curation

  • Karen trying to make the curation of constructs & transgenes easier
  • May consider merging the transgene and construct OA's
  • Possibly add a construct/transgene request functionality in other OA's
    • Would those need multiple input fields?
    • Karen would take care of the details

Molecule model

  • Exogenous/endogenous tags issue
  • Scraping data from external chemical databases versus adding biologically relevant data from papers
  • We pull data from, e.g. CHEBI, but not all molecules fall under their purview, e.g. proteins


  • Promotion and outreach
  • Micropublications discoverable in PubMed?
  • Publisher = WormBase? Caltech?
  • Minimal standards for publication?

July 23, 2015

Worm model for autism

  • Would want to take human variations implicated in autism; look for orthologous genes in C. elegans/nematodes and find/make synonymous mutations
  • Prioritize based on worm phenotypes
  • Generally applies to human disease variants

Database Migration

  • Thomas Down leaving WormBase in September
  • Moving ahead with Datomic
  • A good starting use case for Datomic is querying the Datomic version of GeneACE
  • Need to make sure documentation for migration to Datomic is available and comprehensible
  • Point-people at each site: Sibyl @ OICR, Juancarlos @ Caltech
  • Now need to work out the mechanics of curating into Datomic

WormBase ParaSite

  • Reciprocal searches (WB <-> PS) are working well

Microarray datasets & modSeek

  • Some earlier datasets were re-processed (log-transformed, or re-annotated into original replicates instead of averaged results)
  • Need to try out different methods of processing raw-data (WB usually only takes in processed data)
  • One pipeline can feed data into SPELL and modSeek
  • It's difficult to establish/determine gold standards for assessing process performance

WormBook chapter reviewers

  • Send reviewer suggestions to Paul ASAP

C. elegans proteome in UniProt

  • Not a complete correspondence between WormBase and UniProt
  • Cases: UniProt has entry for a protein that differs by one or two amino acids from WormBase
    • Made from translations of what cDNAs etc. have been submitted
    • Partial data, e.g. partial cDNAs translated
  • Anything we can do to achieve greater consistency?
  • Protein data sets are important
  • Hinxton can use discrepancies as a flag to check on the gene/protein models
  • Would be good to have more reciprocal linkage between UniProt and WormBase
  • For AVR-15, UniProt has two additional entries compared to WormBase, differing by only 1 or 2 amino acids
  • Should we pick up different entries from UniProt and store/display the data; how to reconcile?
  • Possible use case: enter a UniProt ID into the BLAST/BLAT tool to identify WormBase matches
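
A minimal sketch of how such discrepancies could be flagged automatically. Everything here is an assumption for illustration: the function name `classify_pair`, the category labels, and the toy sequences are hypothetical, and real comparisons would need proper alignment to handle insertions and deletions.

```python
def classify_pair(wb_seq, up_seq, max_mismatches=2):
    """Classify a WormBase/UniProt protein sequence pair (hypothetical labels).

    "identical"      : same sequence
    "near-identical" : same length, at most max_mismatches residues differ
    "partial"        : the shorter sequence is contained in the longer one
    "different"      : anything else
    """
    if wb_seq == up_seq:
        return "identical"
    if len(wb_seq) == len(up_seq):
        # Count position-by-position residue differences.
        mismatches = sum(a != b for a, b in zip(wb_seq, up_seq))
        return "near-identical" if mismatches <= max_mismatches else "different"
    # Unequal lengths: check whether one is a fragment of the other,
    # as would happen for a translated partial cDNA.
    shorter, longer = sorted((wb_seq, up_seq), key=len)
    return "partial" if shorter in longer else "different"


# Toy sequences, made up for this example.
print(classify_pair("MKTLLVA", "MKTILVA"))
print(classify_pair("MKTLLVA", "TLLV"))
```

Pairs classified as "near-identical" or "partial" would be the candidates to pass on for a check of the gene/protein models.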

Gene Orienteer Data

  • Sibyl and Xiaodong looking at data and scripts from Gene Orienteer

Precanned queries for exclusive expression

  • Raymond & Juancarlos working on final details
  • Intent is to display genes that may be specifically/exclusively expressed in e.g. an anatomy term
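
One way to express the "exclusive expression" idea, as a sketch only and not the query Raymond and Juancarlos are building: a gene qualifies if every one of its expression annotations is to the anatomy term or one of its descendants. The function name, gene names, and data structures are hypothetical stand-ins for the real database tables.

```python
def exclusively_expressed(gene_to_terms, term, descendants=None):
    """Return genes whose expression annotations all fall under `term`.

    gene_to_terms: dict mapping gene name -> set of annotated anatomy terms
    descendants:   optional dict mapping anatomy term -> set of descendant terms
    """
    allowed = {term} | set((descendants or {}).get(term, ()))
    return sorted(
        gene
        for gene, terms in gene_to_terms.items()
        if terms and terms <= allowed  # non-empty and nothing outside `allowed`
    )


# Toy data, made up for this example.
gene_to_terms = {
    "gene-a": {"pharynx"},
    "gene-b": {"pharynx", "intestine"},
    "gene-c": {"pharyngeal muscle"},
    "gene-d": set(),  # no expression annotations at all
}
descendants = {"pharynx": {"pharyngeal muscle"}}

print(exclusively_expressed(gene_to_terms, "pharynx", descendants))
```

The result is only a "may be exclusively expressed" list, since a missing annotation is not evidence that a gene is not expressed elsewhere.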

Embryonic developmental timing

  • Sulston, Murray timing data sets for wild type embryonic cell division timing
  • Mutant data sets are coming in as well

Genetic Interaction Ontology (GIO)

  • Latest version of the GIO complete
  • Juancarlos and Chris built a "genetic interaction calculator" to determine interaction types from quantitative phenotype inequalities
  • Sending out to other MODs, etc.
  • Although there seems to be conceptual buy-in, most curators can't afford the time for such detailed curation

Phenotype (ontology) display

  • Problems with display of phenotypes (and other annotations) on WormBase, as pointed out by several people at the IWM
  • Karen would like to start creating allele concise descriptions
  • We need compact, intelligently ordered annotation lists, not just alphabetical lists of ontology annotations
  • It would be good to show ancestors for relatedness and order
  • Chris working on Python script to display all annotations in the context of the entire ontology
  • We will need to see if this approach is feasible/beneficial

PATO-style EQ (Entity-Quality) phenotype annotations

  • It is clear that some phenotype annotations require details, e.g. "drug sensitivity" annotations should have the drug involved
    • This drug/molecule annotation should be present in the details if not directly in the term itself
  • Raises the issue of a number of cases where we need PATO-style EQ annotations, not just explicit phenotype terms for all possible scenarios
  • This would be helpful in annotating embryonic timing and identity phenotype datasets

July 30, 2015

Wen Chen helped Wen Chen

  • Wen Chen (lab) has list of genes to analyze
  • Wen Chen (WB) helped process the list
  • Would be good to have a simple CGI to process a list of genes in a variety of ways
  • Is this redundant with WormMine?
    • Not for data that doesn't exist (in WormMine) yet; more agile: could be up and running within a matter of days

Interconnections between WormBase and FlyBase

  • We could create more inter-connectivity between the two databases
  • Sharing concise descriptions of genes
  • Would be good for FlyBase and WB curators (Xiaodong?) to talk about where the links should exist at each site

August 6, 2015


WormMart machine

  • Wen wants to use the machine when WormMart retires

UniProt/WormBase gene class

  • Need to talk to the UniProt C. elegans curator

Raymond, Chris and Juancarlos are working on phenotype viewer

James: given a list of genes, determine which tissues they are enriched in

  • Python code
  • Takes the biotype ontology and tissue expression from postgres as input
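
A rough sketch of what a tissue enrichment calculation could look like. The notes do not say which statistic James's code uses; the hypergeometric test below is an assumption, and the function names, gene names, and in-memory dictionaries (standing in for the postgres tables) are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k tissue-expressed genes when
    drawing n genes from a background of N, of which K are expressed there."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def tissue_enrichment(gene_list, tissue_to_genes, background):
    """Return (tissue, overlap, p_value) tuples sorted by ascending p-value."""
    genes = set(gene_list) & background
    results = []
    for tissue, expressed in tissue_to_genes.items():
        expressed = expressed & background
        overlap = len(genes & expressed)
        p_value = hypergeom_sf(overlap, len(background), len(expressed), len(genes))
        results.append((tissue, overlap, p_value))
    return sorted(results, key=lambda r: r[2])


# Toy data, made up for this example: 10 background genes, 2 tissues.
background = {f"g{i}" for i in range(1, 11)}
tissue_to_genes = {
    "pharynx": {"g1", "g2", "g3"},
    "intestine": {"g4", "g5", "g6", "g7", "g8"},
}
results = tissue_enrichment(["g1", "g2", "g3", "g9"], tissue_to_genes, background)
for tissue, overlap, p_value in results:
    print(tissue, overlap, round(p_value, 4))
```

With many tissues tested at once, the raw p-values would need a multiple-testing correction before being shown to users.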

August 13, 2015

Phenotype term annotation summary graph

Goal: Provide an ontology-relationship-aware summary view of a gene's phenotype annotations.

Prototype links:

  • aex-3 (fewer phenotypes): existing phenotype widget <http://www.wormbase.org/species/c_elegans/gene/WBGene00000086#-b-3>, summary graph <>
  • daf-2 (lots of phenotypes): existing phenotype widget <http://www.wormbase.org/species/c_elegans/gene/WBGene00000898#-b-3>, summary graph <>

Proposed development procedure:

  • Standalone prototyping, commenting and improvements within the group.
  • Implementation as a widget on dev site (juancarlos.wormbase.org), more testing and soliciting comments from selected end users.
  • Committing to main site for general use.

Outline of graph processing:

To gather information:

  • WOBr query to collect all phenotypes annotated to the gene of interest.
  • WOBr query to collect all transitive relationships from the phenotypes collected in the first query towards the ontology root.

To simplify and to control graph size:

  • Remove all nodes (phenotype terms) that are neither directly annotated nor at branching points where two branches of annotations merge (the lowest common ancestor, LCA, if you will).
  • Scale node size according to annotation count (includes inferred annotations).
  • Limit appearance of label to nodes above a given size (roughly big enough to hold term name).
  • Show annotation counts in a mouse-over bubble, and hyperlink each node to its term page.
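
The simplification steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the prototype's code: a plain parent map stands in for the WOBr query results, a "branching point" is taken to be any retained term with two or more children on annotated paths, and the function names, toy ontology, and label threshold are hypothetical.

```python
from collections import defaultdict


def ancestors(term, parents):
    """All ancestors of `term` (excluding itself), following parent links to the root."""
    seen, stack = set(), list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def summarize(parents, direct_counts):
    """Prune the ontology to annotated terms plus branching points.

    Returns (sizes, edges): sizes maps each kept term to its annotation count
    including inferred annotations; edges link kept terms to nearest kept ancestors.
    """
    closure, inferred = set(direct_counts), defaultdict(int)
    for term, count in direct_counts.items():
        anc = ancestors(term, parents)
        closure |= anc
        for node in anc | {term}:
            inferred[node] += count  # annotations propagate up to every ancestor

    children = defaultdict(set)
    for node in closure:
        for parent in parents.get(node, []):
            children[parent].add(node)

    keep = {n for n in closure if n in direct_counts or len(children[n]) >= 2}

    edges = set()
    for node in keep:
        seen, stack = set(), list(parents.get(node, []))
        while stack:
            parent = stack.pop()
            if parent in seen:
                continue
            seen.add(parent)
            if parent in keep:
                edges.add((node, parent))  # stop at the nearest kept ancestor
            else:
                stack.extend(parents.get(parent, []))  # skip over removed nodes
    return {n: inferred[n] for n in keep}, edges


# Toy ontology and annotation counts, made up for this example.
parents = {
    "behavior variant": ["phenotype"],
    "locomotion variant": ["behavior variant"],
    "paralyzed": ["locomotion variant"],
    "sluggish": ["locomotion variant"],
    "development variant": ["phenotype"],
    "larval arrest": ["development variant"],
}
direct_counts = {"paralyzed": 2, "sluggish": 1, "larval arrest": 3}

sizes, edges = summarize(parents, direct_counts)
labelled = {term for term, count in sizes.items() if count >= 3}  # hypothetical label threshold
print(sizes)
print(sorted(edges))
```

In this toy case "behavior variant" and "development variant" are dropped because each lies on a single path, while "locomotion variant" and the root are kept as points where annotation branches merge. The `sizes` values would drive node scaling, and `labelled` mimics showing labels only on sufficiently large nodes.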

International Biocuration Conference

Propose to submit paper on Community Curation

  • Mary Ann happy to lead.
  • Daniela on board.