Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
== July 9, 2015 ==
=== Expression Pattern ===
* Certain/uncertain qualifiers not annotated before some date
* ~3,000 ?Expr_pattern objects without that annotation/tag
* Daniela will work on bringing these up to date; hopefully it won't take long
=== Expression Clusters to Anatomy & Life Stage annotations ===
* Many large scale datasets with tissue-specific expression data
* Much of what is in SPELL is not annotated to ?Anatomy or ?Life_stage terms
* A goal: make expression data queryable via ?Anatomy terms/pages
* Wen will make the model change proposal
* We may not want to show explicitly in widget
* There is a need for a condensed display of expression data (per gene)
* Some datasets, like the EPIC data, explicitly mention each embryonic cell name
* Need for a condensed ontology browser per gene/anatomy and gene/life stage
=== Proteomic analysis ===
* Encyclopedia of Proteomic Dynamics contacted Wen to share data
* Wen will meet/discuss with group soon to determine what the goals are
* It isn't clear what format the data has
* Should include Gary Williams on discussions as he already processes Mass Spectrometry data
=== External Databases ===
* To what extent can we take care of the data and display of other lab/publication databases?
* Many authors want to share and make links to their database/website via WormBase
* What is the best way to handle large scale dataset sharing requests that don't necessarily (for the time being) fit our data model?
* We can take advantage of the "External Links" display on WBPaper pages to link out to the external databases affiliated with the paper, including a link to our FTP site with shared data files, maybe?
* At least a stop gap measure until we can properly model the data
=== Cis-regulatory site nomenclature? ===
* Barbara Meyer's lab published many "rex" (Recruitment Elements on X) sites, numbered sequentially
* Tim Schedl wondering about others' thoughts/opinions on how to, possibly, standardize the names of cis-regulatory elements
* Could be like gene names, without dash, e.g. "rex1", "rex2"
* We may want to try a "WBsf-" prefix on all element names, e.g. "WBsf-rex1", although it may only be used in-house
=== Phenotypes ===
* Were there any conclusions about phenotype lookup from the Allele-Phenotype form?
* Chris spoke with Harald Hutter and others at the meeting about how to improve the lookup for phenotypes
* Would be good to provide an explicit option to see phenotypes of related (or allele-affiliated) genes, perhaps by shared GO-term annotation
* Need to think more on how to best compress display of phenotypes on gene pages as well
* We do already provide links to the Variation and Gene pages (with Phenotypes displayed) in the term information box of the form
== July 16, 2015 ==
=== Anatomy term page expression ===
* Raymond and Juancarlos are working on a display of genes that may be exclusively expressed in that anatomy object
=== Construct/Transgene curation ===
* Karen trying to make the curation of constructs & transgenes easier
* May consider merging the transgene and construct OA's
* Possibly add a construct/transgene request functionality in other OA's
** Would those need multiple input fields?
** Karen would take care of the details
=== Molecule model ===
* Exogenous/endogenous tags issue
* Scraping data from external chemical databases versus adding biologically relevant data from papers
* We pull data from, e.g. ChEBI, but not all molecules fall under their purview, e.g. proteins
=== Micropublication ===
* Promotion and outreach
* Micropublications discoverable in PubMed?
* Publisher = WormBase? Caltech?
* Minimal standards for publication?
== July 23, 2015 ==
=== Worm model for autism ===
* Would want to take human variations implicated in autism; look for orthologous genes in C. elegans/nematodes and find/make synonymous mutations
* Prioritize based on worm phenotypes
* Generally applies to human disease variants
=== Database Migration ===
* Thomas Down leaving WormBase in September
* Moving ahead with Datomic
* A good starting use-case for Datomic is querying a Datomic version of GeneACE
* Need to make sure documentation for migration to Datomic is available and comprehensible
* Point-people at each site: Sibyl @ OICR, Juancarlos @ Caltech
* Now need to work out the mechanics of curating into Datomic
=== WormBase ParaSite ===
* Reciprocal searches (WB <-> PS) are working well
=== Microarray datasets & modSeek ===
* Some earlier datasets were re-processed (log-transformed, or re-annotated into original replicates instead of averaged results)
* Need to try out different methods of processing raw-data (WB usually only takes in processed data)
* One pipeline can feed data into SPELL and modSeek
* It's difficult to establish/determine gold standards for assessing process performance
=== WormBook chapter reviewers ===
* Send reviewer suggestions to Paul ASAP
=== C. elegans proteome in UniProt ===
* Not a complete correspondence between WormBase and UniProt
* Cases: UniProt has entry for a protein that differs by one or two amino acids from WormBase
** Made from translations of what cDNAs etc. have been submitted
** Partial data, e.g. partial cDNAs translated
* Anything we can do to achieve greater consistency?
* Protein data sets are important
* Hinxton can use discrepancies as a flag to check on the gene/protein models
* Would be good to have more reciprocal linkage between UniProt and WormBase
* AVR-15: UniProt has two additional entries compared to WormBase, differing in only 1 or 2 amino acids
* Should we pick up different entries from UniProt and store/display the data; how to reconcile?
* Possible use case: enter a UniProt ID into the BLAST/BLAT tool to identify WormBase matches
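As an illustration of how near-identical entries could be flagged, here is a minimal Python sketch. It assumes the two protein sequences are already in hand as strings of equal length and only counts substitutions; insertions and deletions would need a real alignment. The function names and the threshold of two differences are hypothetical, chosen to match the "one or two amino acids" cases noted above.

```python
def substitutions(seq_a, seq_b):
    """Return (position, residue_a, residue_b) for each mismatch.

    Positions are 1-based. Returns None when the lengths differ,
    because this sketch does not attempt an alignment.
    """
    if len(seq_a) != len(seq_b):
        return None
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


def is_near_match(seq_a, seq_b, max_diffs=2):
    """True when the sequences differ by 1..max_diffs substitutions."""
    diffs = substitutions(seq_a, seq_b)
    return diffs is not None and 0 < len(diffs) <= max_diffs
```

Pairs for which `is_near_match` is true would be candidates for a curator to check against the gene model.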
=== Gene Orienteer Data ===
* Sibyl and Xiaodong looking at data and scripts from Gene Orienteer
=== Precanned queries for exclusive expression ===
* Raymond & Juancarlos working on final details
* Intent is to display genes that may be specifically/exclusively expressed in e.g. an anatomy term
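One possible reading of "exclusively expressed" can be sketched in Python: a gene qualifies when every one of its expression annotations falls on the anatomy term or one of its descendants. This is an assumption for illustration, not the actual precanned query; the `children` mapping, the gene names, and the function names are all hypothetical.

```python
def descendants(term, children):
    """Return all terms below `term` in the ontology (term itself excluded)."""
    seen = set()
    stack = [term]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def exclusively_expressed(term, gene_to_terms, children):
    """Genes whose annotations all lie on `term` or its descendants.

    gene_to_terms: dict gene -> set of anatomy terms it is annotated to.
    Genes with no annotations are skipped.
    """
    allowed = descendants(term, children) | {term}
    return sorted(
        gene
        for gene, terms in gene_to_terms.items()
        if terms and set(terms) <= allowed
    )
```

For example, with a toy ontology where "pharyngeal muscle" is a child of "pharynx", a gene annotated to both "pharynx" and "intestine" would be excluded, while a gene annotated only to "pharyngeal muscle" would be kept.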
=== Embryonic developmental timing ===
* Sulston, Murray timing data sets for wild type embryonic cell division timing
* Mutant data sets are coming in as well
=== Genetic Interaction Ontology (GIO) ===
* Latest version of the GIO complete
* Juancarlos and Chris built a "genetic interaction calculator" to determine interaction types from quantitative phenotype inequalities
** http://mangolassi.caltech.edu/~azurebrd/cgi-bin/forms/gi_calculator.cgi
* Sending out to other MODs, etc.
* Seems that although there is buy in conceptually, most curators can't afford the time for such detailed curation
=== Phenotype (ontology) display ===
* Problems with display of phenotypes (and other annotations) on WormBase, as pointed out by several people at the IWM
* Karen would like to start creating allele concise descriptions
* We need compact, intelligently ordered annotation lists, not just alphabetical lists of ontology annotations
* It would be good to show ancestors for relatedness and order
* Chris working on Python script to display all annotations in the context of the entire ontology
* We will need to see if this approach is feasible/beneficial
=== PATO-style EQ (Entity-Quality) phenotype annotations ===
* It is clear that some phenotype annotations require details, e.g. "drug sensitivity" annotations should have the drug involved
** This drug/molecule annotation should be present in the details if not directly in the term itself
* Raises the issue of a number of cases where we need PATO-style EQ annotations, not just explicit phenotype terms for all possible scenarios
* This would be helpful in annotating embryonic timing and identity phenotype datasets
== July 30, 2015 ==
=== Wen Chen helped Wen Chen ===
* Wen Chen (lab) has list of genes to analyze
* Wen Chen (WB) helped process the list
* Would be good to have a simple CGI to process a list of genes in a variety of ways
** http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/fraqmine.cgi
** Currently handles GeneTissueLifeStage and GeneConciseDescription; more datatypes can easily be slotted in if a curator makes a file
* Is this redundant with WormMine?
** Not for data that doesn't exist (in WormMine) yet; more agile: could be up and running within a matter of days
=== Interconnections between WormBase and FlyBase ===
* We could create more inter-connectivity between the two databases
* Sharing concise descriptions of genes
* Would be good for FlyBase and WB curators (Xiaodong?) to talk about where the links should exist at each site
== August 6, 2015 ==

Revision as of 17:10, 13 August 2015

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

August 6, 2015


WormMart machine

  • Wen wants to use the machine when WormMart retires

UniProt/WormBase gene class

  • need to talk to UniProt C. elegans curator

Raymond, Chris and Juancarlos are working on phenotype viewer

James: given a list of genes, find which tissues they are enriched in

  • python code
  • biotype ontology, tissue expression from postgres as input
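The notes do not say which statistic the Python code uses. As a minimal sketch, assuming a one-sided hypergeometric test per tissue, the enrichment of a gene list could be computed as follows. The function names and the `tissue_to_genes` and `background` inputs are hypothetical stand-ins for the ontology and Postgres data mentioned above, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, K of which are in the tissue."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def tissue_enrichment(gene_list, tissue_to_genes, background):
    """Return dict tissue -> p-value for over-representation in gene_list.

    gene_list: iterable of gene names to test.
    tissue_to_genes: dict tissue -> set of genes expressed there.
    background: set of all genes considered; other genes are ignored.
    """
    genes = set(gene_list) & background
    results = {}
    for tissue, members in tissue_to_genes.items():
        members = members & background
        overlap = len(genes & members)
        results[tissue] = hypergeom_pvalue(
            overlap, len(genes), len(members), len(background)
        )
    return results
```

A small p-value for a tissue suggests the list contains more genes expressed there than expected by chance under this model.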

August 13, 2015

Phenotype term annotation summary graph

Goal: Provide an ontology-relationship-aware summary view of a gene's phenotype annotations.

Prototype links:

  • aex-3 (fewer phenotypes): existing phenotype widget <http://www.wormbase.org/species/c_elegans/gene/WBGene00000086#-b-3>, summary graph <>
  • daf-2 (lots of phenotypes): existing phenotype widget <http://www.wormbase.org/species/c_elegans/gene/WBGene00000898#-b-3>, summary graph <>

Proposed development procedure:
  • standalone prototyping, commenting and improvements within the group.
  • implementation as a widget on dev site (juancarlos.wormbase.org), more testing and soliciting comments from selected end users.
  • committing to main site for general use.
Outline of graph processing:

To gather information:

  • WOBr query to collect all phenotypes annotated to the gene of interest.
  • WOBr query to collect all transitive relationships from the phenotypes gathered in the previous step towards the ontology root.

To simplify and to control graph size:

  • Remove all nodes (phenotype terms) that are neither directly annotated to the gene nor branching points where two branches of annotations merge (the lowest common ancestor, LCA, if you will).
  • Scale node size according to annotation count (includes inferred annotations).
  • Limit appearance of label to nodes above a given size (roughly big enough to hold term name).
  • Show annotation counts in a mouse-over bubble and link each node to its term page.
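
The gathering and pruning steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the prototype's code: the WOBr queries are replaced by an in-memory `parents` dictionary, a branching point is approximated as any node with two or more children in the annotated subgraph, and all names are hypothetical.

```python
from collections import defaultdict


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def summarize(annotations, parents):
    """Prune the ontology to the nodes worth drawing for one gene.

    annotations: dict term -> number of direct annotations for the gene.
    parents: dict term -> list of parent terms (toy stand-in for WOBr).
    Returns dict kept_term -> annotation count including inferred ones.
    """
    # Each direct annotation also counts once toward every ancestor.
    counts = defaultdict(int)
    for term, n in annotations.items():
        counts[term] += n
        for anc in ancestors(term, parents):
            counts[anc] += n

    # Record parent -> children edges within the annotated subgraph.
    children = defaultdict(set)
    for term in set(counts):
        for parent in parents.get(term, ()):
            children[parent].add(term)

    # Keep directly annotated terms and nodes where branches merge.
    kept = {
        term
        for term in set(counts)
        if term in annotations or len(children[term]) >= 2
    }
    return {term: counts[term] for term in kept}
```

The returned counts could then drive node size, with labels shown only for nodes above a chosen count.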

International Biocuration Conference

Propose to submit paper on Community Curation

  • Mary Ann happy to lead.
  • Daniela on board.