WormBase-Caltech Weekly Calls

From WormBaseWiki

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings


2015 Meetings

January

February

March

April

May

June

July


July 2, 2015

Discussion Topics from IWM

  • Explaining job posting options via the forum in new Worm Breeder's Gazette article
  • Display of CRISPR data
    • Alleles with multiple lesions (one name, many mutations), need to be curated and mapped
  • Ontology term enrichment analysis, using ontologies other than gene ontology
    • Discussed on GO call yesterday; we can/should follow up with Paul Thomas
    • Would be good to have a single central tool/resource for enrichment analysis
    • PantherDB vs DAVID

WormBook Chapters

  • Paul S will review over next week and will provide feedback

Outreach

  • Sending out e-mails to all labs/PIs reminding about new data forms
  • Could also do more personalized outreach to a smaller subset of PIs/labs
    • Could focus on PIs not at the IWM

Anatomy

  • Embryonic development, cell division timing
    • Sulston timing
    • Waterston datasets
    • Zhao cell lineage timing datasets
    • Bao lab?
  • New EM reconstructions from David Hall, Scott Emmons, etc.
  • Neuronal connectivity, collaborative database with Scott Emmons and colleagues

Citace upload

  • Curators submit data to Wen on Tuesday, July 28th

Taking over Gene Orienteer

  • Xiaodong and Sibyl working on

RNASeq data

  • Gary Williams is only using high-quality data and is taking care of all curation (including metadata)
  • Public archive of rejected datasets?

WormGuides

  • Bill Mohler et al. working on a desktop application

July 9, 2015

Expression Pattern

  • Certain/uncertain qualifiers not annotated before some date
  • ~3,000 ?Expr_pattern objects without that annotation/tag
  • Daniela will work on bringing these up to date; hopefully it won't take long

Expression Clusters to Anatomy & Life Stage annotations

  • Many large scale datasets with tissue-specific expression data
  • Much of what is in SPELL is not annotated to ?Anatomy or ?Life_stage terms
  • A goal: make expression data queryable via ?Anatomy terms/pages
  • Wen will make the model change proposal
  • We may not want to show explicitly in widget
  • There is a need for a condensed display of expression data (per gene)
  • Some datasets, like the EPIC data, explicitly mention each embryonic cell name
  • Need for a condensed ontology browser per gene/anatomy and gene/life stage

Proteomic analysis

  • Encyclopedia of Proteomic Dynamics, contacted Wen to share data
  • Wen will meet/discuss with group soon to determine what the goals are
  • It isn't clear what format the data has
  • Should include Gary Williams on discussions as he already processes Mass Spectrometry data

External Databases

  • To what extent can we take care of the data and display of other lab/publication databases
  • Many authors want to share and make links to their database/website via WormBase
  • What is the best way to handle large scale dataset sharing requests that don't necessarily (for the time being) fit our data model
  • We can take advantage of the "External Links" display on WBPaper pages to link out to the external databases affiliated with the paper, perhaps including a link to our FTP site with shared data files
  • At least a stop gap measure until we can properly model the data

Cis-regulatory site nomenclature?

  • Barbara Meyer's lab published many "rex" (Recruitment Elements on X) sites, numbered sequentially
  • Tim Schedl wondering about others' thoughts/opinions on how to, possibly, standardize the names of cis-regulatory elements
  • Could be like gene names, without dash, e.g. "rex1", "rex2"
  • We may want to try a "WBsf-" prefix on all element names, like "WBsf-rex1", although it may only be used in-house

Phenotypes

  • Were there any conclusions about phenotype lookup from the Allele-Phenotype form?
  • Chris spoke with Harald Hutter and others at the meeting about how to improve the lookup for phenotypes
  • Would be good to provide an explicit option to see phenotypes of related (or allele-affiliated) genes, perhaps by shared GO-term annotation
  • Need to think more on how to best compress display of phenotypes on gene pages as well
  • We do already provide links to the Variation and Gene pages (with Phenotypes displayed) in the term information box of the form


July 16, 2015

Anatomy term page expression

  • Raymond and Juancarlos are working on a display of genes that may be exclusively expressed in that anatomy object

Construct/Transgene curation

  • Karen trying to make the curation of constructs & transgenes easier
  • May consider merging the transgene and construct OAs
  • Possibly add a construct/transgene request functionality in other OAs
    • Would those need multiple input fields?
    • Karen would take care of the details

Molecule model

  • Exogenous/endogenous tags issue
  • Scraping data from external chemical databases versus adding biologically relevant data from papers
  • We pull data from, e.g., ChEBI, but not all molecules fall under their purview, e.g. proteins

Micropublication

  • Promotion and outreach
  • Micropublications discoverable in PubMed?
  • Publisher = WormBase? Caltech?
  • Minimal standards for publication?


July 23, 2015

Worm model for autism

  • Would want to take human variations implicated in autism, look for orthologous genes in C. elegans/nematodes, and find/make the equivalent mutations
  • Prioritize based on worm phenotypes
  • Generally applies to human disease variants

Database Migration

  • Thomas Down leaving WormBase in September
  • Moving ahead with Datomic
  • Good starting use-case for Datomic is querying Datomic-version of GeneACE
  • Need to make sure documentation for migration to Datomic is available and comprehensible
  • Point-people at each site: Sibyl @ OICR, Juancarlos @ Caltech
  • Now need to work out the mechanics of curating into Datomic

WormBase ParaSite

  • Reciprocal searches (WB <-> PS) are working well

Microarray datasets & modSeek

  • Some earlier datasets were re-processed (log-transformed, or re-annotated into original replicates instead of averaged results)
  • Need to try out different methods of processing raw-data (WB usually only takes in processed data)
  • One pipeline can feed data into SPELL and modSeek
  • It's difficult to establish/determine gold standards for assessing process performance
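
The two re-processing steps mentioned above (log-transforming values and keeping the original replicates rather than averaged results) can be illustrated with a minimal Python sketch. The gene names and intensities are made up, and the base-2 log with a pseudocount of 1 is an assumption for illustration, not necessarily what the SPELL/modSeek pipeline uses.

```python
import math

def log2_transform(values, pseudocount=1.0):
    """Return log2(value + pseudocount) for each raw intensity."""
    return [math.log2(v + pseudocount) for v in values]

# Hypothetical raw intensities: gene -> one value per replicate
raw = {
    "gene-A": [3.0, 7.0, 15.0],
    "gene-B": [0.0, 1.0, 3.0],
}

# Keep one log-transformed value per replicate rather than a single average
per_replicate = {gene: log2_transform(vals) for gene, vals in raw.items()}

# For comparison: the averaged form in which some earlier datasets were stored
averaged = {gene: sum(vals) / len(vals) for gene, vals in per_replicate.items()}
```

Keeping the per-replicate values means the averaged form can always be recomputed, whereas the reverse is not possible.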

WormBook chapter reviewers

  • Send reviewer suggestions to Paul ASAP

C. elegans proteome in UniProt

  • Not a complete correspondence between WormBase and UniProt
  • Cases: UniProt has entry for a protein that differs by one or two amino acids from WormBase
    • Made from translations of what cDNAs etc. have been submitted
    • Partial data, e.g. partial cDNAs translated
  • Anything we can do to achieve greater consistency?
  • Protein data sets are important
  • Hinxton can use discrepancies as a flag to check on the gene/protein models
  • Would be good to have more reciprocal linkage between UniProt and WormBase
  • AVR-15: UniProt has two additional entries compared to WormBase, differing in only 1 or 2 amino acids
  • Should we pick up different entries from UniProt and store/display the data; how to reconcile?
  • Possible use case: enter a UniProt ID into the BLAST/BLAT tool to identify WormBase matches
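
As a rough illustration of how near-identical entries could be flagged, the sketch below compares a WormBase sequence with a UniProt sequence position by position and classifies the pair. The function names, category labels, and toy sequences are hypothetical; real cases (partial cDNAs, insertions/deletions) would need a proper alignment rather than this equal-length comparison.

```python
def substitutions(seq_a, seq_b):
    """Return (position, residue_a, residue_b) for each mismatch.

    Positions are 1-based. Returns None when the lengths differ, since a
    position-by-position comparison is not meaningful in that case.
    """
    if len(seq_a) != len(seq_b):
        return None
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

def classify(wb_seq, up_seq, max_diffs=2):
    """Label a WormBase/UniProt pair (labels are illustrative only)."""
    diffs = substitutions(wb_seq, up_seq)
    if diffs is None:
        return "length_mismatch"   # e.g. a partial cDNA translation
    if not diffs:
        return "identical"
    if len(diffs) <= max_diffs:
        return "near_identical"    # differs by one or two amino acids
    return "different"

# Toy sequences, not real proteins
wb = "MKTAYIAK"
up = "MKTSYIAK"
label = classify(wb, up)
```

Pairs labelled `near_identical` would be the ones worth flagging for a check of the gene/protein model.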

Gene Orienteer Data

  • Sibyl and Xiaodong looking at data and scripts from Gene Orienteer

Precanned queries for exclusive expression

  • Raymond & Juancarlos working on final details
  • Intent is to display genes that may be specifically/exclusively expressed in e.g. an anatomy term
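
A minimal sketch of the idea, assuming expression annotations are available as (gene, anatomy term) pairs: a gene is reported for a term only if that term is the gene's sole annotated anatomy term. The gene and term names are placeholders, and this version ignores the ontology hierarchy (sub-terms are treated as distinct terms), which the real query may handle differently.

```python
from collections import defaultdict

def exclusively_expressed(annotations, term):
    """Return genes whose only annotated anatomy term is `term`."""
    terms_by_gene = defaultdict(set)
    for gene, anatomy_term in annotations:
        terms_by_gene[gene].add(anatomy_term)
    return sorted(g for g, terms in terms_by_gene.items() if terms == {term})

# Hypothetical annotations
annotations = [
    ("gene-1", "pharynx"),
    ("gene-1", "intestine"),
    ("gene-2", "pharynx"),
    ("gene-3", "intestine"),
]
pharynx_only = exclusively_expressed(annotations, "pharynx")
```

The result is only "may be exclusive": a missing annotation elsewhere does not prove absence of expression.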

Embryonic developmental timing

  • Sulston, Murray timing data sets for wild type embryonic cell division timing
  • Mutant data sets are coming in as well

Genetic Interaction Ontology (GIO)

  • Latest version of the GIO complete
  • Juancarlos and Chris built a "genetic interaction calculator" to determine interaction types from quantitative phenotype inequalities
  • Sending out to other MODs, etc.
  • Seems that although there is buy in conceptually, most curators can't afford the time for such detailed curation

Phenotype (ontology) display

  • Problems with display of phenotypes (and other annotations) on WormBase, as pointed out by several people at the IWM
  • Karen would like to start creating allele concise descriptions
  • We need compact, intelligently ordered annotation lists, not just alphabetical lists of ontology annotations
  • It would be good to show ancestors for relatedness and order
  • Chris working on Python script to display all annotations in the context of the entire ontology
  • We will need to see if this approach is feasible/beneficial

PATO-style EQ (Entity-Quality) phenotype annotations

  • It is clear that some phenotype annotations require details, e.g. "drug sensitivity" annotations should have the drug involved
    • This drug/molecule annotation should be present in the details if not directly in the term itself
  • Raises the issue of a number of cases where we need PATO-style EQ annotations, not just explicit phenotype terms for all possible scenarios
  • This would be helpful in annotating embryonic timing and identity phenotype datasets


July 30, 2015

Wen Chen helped Wen Chen

  • Wen Chen (lab) has list of genes to analyze
  • Wen Chen (WB) helped process the list
  • Would be good to have a simple CGI to process a list of genes in a variety of ways
  • Is this redundant with WormMine?
    • Not for data that doesn't exist (in WormMine) yet; more agile: could be up and running within a matter of days

Interconnections between WormBase and FlyBase

  • We could create more inter-connectivity between the two databases
  • Sharing concise descriptions of genes
  • Would be good for FlyBase and WB curators (Xiaodong?) to talk about where the links should exist at each site

August 6, 2015

WormMine

WormMart machine

  • Wen wants to use the machine when WormMart retires

UniProt/WormBase gene class

  • Need to talk to the UniProt C. elegans curator

Raymond, Chris and Juancarlos are working on a phenotype viewer

James: given a list of genes, determine which tissues they are enriched in

  • Python code
  • biotype ontology, tissue expression from postgres as input
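
One common way to ask "enriched in what tissues" is a hypergeometric (one-sided) test per tissue. The sketch below is an assumption about the general approach, not James's actual code; the function names, the in-memory dictionaries standing in for the Postgres input, and the toy gene names are all hypothetical, and no multiple-testing correction is applied.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K are annotated to the tissue."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

def tissue_enrichment(gene_list, tissue_to_genes, background):
    """Return {tissue: p-value} for the overlap between gene_list and each tissue."""
    genes = set(gene_list) & background
    results = {}
    for tissue, annotated in tissue_to_genes.items():
        annotated = annotated & background
        k = len(genes & annotated)  # genes in the list that are expressed in this tissue
        results[tissue] = hypergeom_pvalue(k, len(genes), len(annotated), len(background))
    return results

# Toy input: 10 background genes, two tissues
background = {f"g{i}" for i in range(10)}
tissue_to_genes = {"neuron": {"g0", "g1", "g2"}, "muscle": {"g7", "g8"}}
pvals = tissue_enrichment(["g0", "g1", "g2"], tissue_to_genes, background)
```

Small p-values indicate tissues over-represented in the list; with many tissues tested, a correction such as Bonferroni or Benjamini-Hochberg would normally follow.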


August 13, 2015

Phenotype term annotation summary graph

Goal: Provide an ontology-relationship-aware summary view of a gene's phenotype annotations.

Prototype links:

  • aex-3 (fewer phenotypes)
    • Existing phenotype widget: <http://www.wormbase.org/species/c_elegans/gene/WBGene00000086#-b-3>
    • Summary graph: <http://131.215.12.204/~azurebrd/cgi-bin/amigo.cgi?action=annotSummaryGraph&focusTermId=WBGene00000086>
  • daf-2 (lots of phenotypes)
    • Existing phenotype widget: <http://www.wormbase.org/species/c_elegans/gene/WBGene00000898#-b-3>
    • Summary graph: <http://131.215.12.204/~azurebrd/cgi-bin/amigo.cgi?action=annotSummaryGraph&focusTermId=WBGene00000898>

Proposed development procedure:

  • Standalone prototyping, commenting and improvements within the group.
  • Implementation as a widget on dev site (juancarlos.wormbase.org), more testing and soliciting comments from selected end users.
  • Committing to main site for general use.

Outline of graph processing:

To gather information:

  • WOBr query to collect all phenotypes annotated to the gene of interest.
  • WOBr query to collect all transitive relationships of the phenotypes from the first query towards the ontology root.

To simplify and to control graph size:

  • Remove all nodes (phenotype terms) that are neither directly annotated nor at branching points where two branches of annotations merge (the lowest common ancestor, LCA, if you will).
  • Scale node size according to annotation count (includes inferred annotations).
  • Limit appearance of label to nodes above a given size (roughly big enough to hold term name).
  • Show annotation counts in mouse-over bubble, add hyperlink to term pages to each node
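
The gathering and simplification steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the prototype's actual code: the ontology is given as a hypothetical `parents` dictionary (term to set of parent terms) instead of WOBr queries, a "branching point" is taken to mean a node with two or more children in the annotation-induced subgraph, and the term names are placeholders.

```python
def ancestors(term, parents):
    """All transitive ancestors of `term` (excluding the term itself)."""
    seen, stack = set(), [term]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def summarize(annotated, parents):
    """annotated: term -> direct annotation count; parents: term -> set of parents.

    Returns (counts, edges): counts include inferred annotations, and edges
    link each kept node to its nearest kept ancestor(s).
    """
    closure = {t: ancestors(t, parents) for t in annotated}
    nodes = set(annotated).union(*closure.values())

    # Count children within the subgraph induced by annotations and ancestors
    child_count = {n: 0 for n in nodes}
    for n in nodes:
        for p in parents.get(n, ()):
            child_count[p] += 1

    # Keep directly annotated terms and branching points only
    keep = {n for n in nodes if n in annotated or child_count[n] >= 2}

    # Annotation counts, including those inferred from descendants
    counts = {n: annotated.get(n, 0) for n in keep}
    for t, anc in closure.items():
        for a in anc & keep:
            counts[a] += annotated[t]

    # Reconnect kept nodes across the removed intermediate terms
    edges = set()
    for n in keep:
        stack, seen = list(parents.get(n, ())), set()
        while stack:
            p = stack.pop()
            if p in seen:
                continue
            seen.add(p)
            if p in keep:
                edges.add((n, p))
            else:
                stack.extend(parents.get(p, ()))
    return counts, edges

# Toy ontology: root <- A <- B <- {C, D}; root <- E <- F
parents = {"A": {"root"}, "B": {"A"}, "C": {"B"}, "D": {"B"}, "E": {"root"}, "F": {"E"}}
annotated = {"C": 2, "D": 1, "F": 3}
counts, edges = summarize(annotated, parents)
```

In this toy case the intermediate terms A and E drop out, while B and root are kept as the points where annotation branches merge. The resulting `counts` could then drive node size and the label threshold described above.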

International Biocuration Conference

Propose to submit paper on Community Curation

  • Mary Ann happy to lead.
  • Daniela on board.