Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
  
  
== July 9, 2015 ==
 
  
=== Expression Pattern ===
 
* Certain/uncertain qualifiers not annotated before some date
 
* ~3,000 ?Expr_pattern objects without that annotation/tag
 
* Daniela will work on bringing these up to date; hopefully it won't take long
 
 
=== Expression Clusters to Anatomy & Life Stage annotations ===
 
* Many large scale datasets with tissue-specific expression data
 
* Much of what is in SPELL is not annotated to ?Anatomy or ?Life_stage terms
 
* A goal: make expression data queryable via ?Anatomy terms/pages
 
* Wen will make the model change proposal
 
* We may not want to show explicitly in widget
 
* There is a need for a condensed display of expression data (per gene)
 
* Some datasets, like the EPIC data, explicitly mention each embryonic cell name
 
* Need for a condensed ontology browser per gene/anatomy and gene/life stage
 
 
=== Proteomic analysis ===
 
* The Encyclopedia of Proteomic Dynamics group contacted Wen to share data
 
* Wen will meet/discuss with group soon to determine what the goals are
 
* It isn't clear what format the data has
 
* Should include Gary Williams on discussions as he already processes Mass Spectrometry data
 
 
=== External Databases ===
 
* To what extent can we take care of the data and display of other lab/publication databases
 
* Many authors want to share and make links to their database/website via WormBase
 
* What is the best way to handle large scale dataset sharing requests that don't necessarily (for the time being) fit our data model
 
* We can take advantage of the "External Links" display on WBPaper pages to link out to the external databases affiliated with the paper, including a link to our FTP site with shared data files, maybe?
 
* At least a stop gap measure until we can properly model the data
 
 
=== Cis-regulatory site nomenclature? ===
 
* Barbara Meyer's lab published many "rex" (Recruitment Elements on X) sites, numbered sequentially
 
* Tim Schedl wondering about others' thoughts/opinions on how to, possibly, standardize the names of cis-regulatory elements
 
* Could be like gene names, without dash, e.g. "rex1", "rex2"
 
* We may want to try a "WBsf-" prefix on all element names, e.g. "WBsf-rex1", although it may only be used in-house
 
 
=== Phenotypes ===
 
* Were there any conclusions about phenotype lookup from the Allele-Phenotype form?
 
* Chris spoke with Harald Hutter and others at the meeting about how to improve the lookup for phenotypes
 
* Would be good to provide an explicit option to see phenotypes of related (or allele-affiliated) genes, perhaps by shared GO-term annotation
 
* Need to think more on how to best compress display of phenotypes on gene pages as well
 
* We do already provide links to the Variation and Gene pages (with Phenotypes displayed) in the term information box of the form
 
 
 
== July 16, 2015 ==
 
 
=== Anatomy term page expression ===
 
* Raymond and Juancarlos are working on a display of genes that may be exclusively expressed in that anatomy object
 
 
=== Construct/Transgene curation ===
 
* Karen trying to make the curation of constructs & transgenes easier
 
* May consider merging the transgene and construct OAs
 
* Possibly add a construct/transgene request functionality in other OAs
 
** Would those need multiple input fields?
 
** Karen would take care of the details
 
 
=== Molecule model ===
 
* Exogenous/endogenous tags issue
 
* Scraping data from external chemical databases versus adding biologically relevant data from papers
 
* We pull data from, e.g. CHEBI, but not all molecules fall under their purview, e.g. proteins
 
 
=== Micropublication ===
 
* Promotion and outreach
 
* Micropublications discoverable in PubMed?
 
* Publisher = WormBase? Caltech?
 
* Minimal standards for publication?
 
 
 
== July 23, 2015 ==
 
 
=== Worm model for autism ===
 
* Would want to take human variations implicated in autism; look for orthologous genes in C. elegans/nematodes and find/make synonymous mutations
 
* Prioritize based on worm phenotypes
 
* Generally applies to human disease variants
 
 
=== Database Migration ===
 
* Thomas Down leaving WormBase in September
 
* Moving ahead with Datomic
 
* A good starting use case for Datomic is querying the Datomic version of GeneACE
 
* Need to make sure documentation for migration to Datomic is available and comprehensible
 
* Point-people at each site: Sibyl @ OICR, Juancarlos @ Caltech
 
* Now need to work out the mechanics of curating into Datomic
 
 
=== WormBase ParaSite ===
 
* Reciprocal searches (WB <-> PS) are working well
 
 
=== Microarray datasets & modSeek ===
 
* Some earlier datasets were re-processed (log-transformed, or re-annotated into original replicates instead of averaged results)
 
* Need to try out different methods of processing raw-data (WB usually only takes in processed data)
 
* One pipeline can feed data into SPELL and modSeek
 
* It's difficult to establish/determine gold standards for assessing process performance
 
 
=== WormBook chapter reviewers ===
 
* Send reviewer suggestions to Paul ASAP
 
 
=== C. elegans proteome in UniProt ===
 
* Not a complete correspondence between WormBase and UniProt
 
* Cases: UniProt has entry for a protein that differs by one or two amino acids from WormBase
 
** Made from translations of what cDNAs etc. have been submitted
 
** Partial data, e.g. partial cDNAs translated
 
* Anything we can do to achieve greater consistency?
 
* Protein data sets are important
 
* Hinxton can use discrepancies as a flag to check on the gene/protein models
 
* Would be good to have more reciprocal linkage between UniProt and WormBase
 
* AVR-15: UniProt has two additional entries compared to WormBase, differing by only 1 or 2 amino acids
 
* Should we pick up different entries from UniProt and store/display the data; how to reconcile?
 
* Possible use case: enter a UniProt ID into the BLAST/BLAT tool to identify WormBase matches
 
 
=== Gene Orienteer Data ===
 
* Sibyl and Xiaodong looking at data and scripts from Gene Orienteer
 
 
=== Precanned queries for exclusive expression ===
 
* Raymond & Juancarlos working on final details
 
* Intent is to display genes that may be specifically/exclusively expressed in e.g. an anatomy term
 
 
=== Embryonic developmental timing ===
 
* Sulston, Murray timing data sets for wild type embryonic cell division timing
 
* Mutant data sets are coming in as well
 
 
=== Genetic Interaction Ontology (GIO) ===
 
* Latest version of the GIO complete
 
* Juancarlos and Chris built a "genetic interaction calculator" to determine interaction types from quantitative phenotype inequalities
 
** http://mangolassi.caltech.edu/~azurebrd/cgi-bin/forms/gi_calculator.cgi
 
* Sending out to other MODs, etc.
 
* It seems that, although there is conceptual buy-in, most curators can't afford the time for such detailed curation
 
 
=== Phenotype (ontology) display ===
 
* Problems with display of phenotypes (and other annotations) on WormBase, as pointed out by several people at the IWM
 
* Karen would like to start creating allele concise descriptions
 
* We need compact, intelligently ordered annotation lists, not just alphabetical lists of ontology annotations
 
* It would be good to show ancestors for relatedness and order
 
* Chris working on Python script to display all annotations in the context of the entire ontology
 
* We will need to see if this approach is feasible/beneficial
 
 
=== PATO-style EQ (Entity-Quality) phenotype annotations ===
 
* It is clear that some phenotype annotations require details, e.g. "drug sensitivity" annotations should have the drug involved
 
** This drug/molecule annotation should be present in the details if not directly in the term itself
 
* Raises the issue of a number of cases where we need PATO-style EQ annotations, not just explicit phenotype terms for all possible scenarios
 
* This would be helpful in annotating embryonic timing and identity phenotype datasets
 
 
 
== July 30, 2015 ==
 
 
=== Wen Chen helped Wen Chen ===
 
* Wen Chen (lab) has list of genes to analyze
 
* Wen Chen (WB) helped process the list
 
* Would be good to have a simple CGI to process a list of genes in a variety of ways
 
** http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/fraqmine.cgi
 
** Currently for GeneTissueLifeStage and GeneConciseDescription; more datatypes can easily be slotted in if a curator makes a file
 
* Is this redundant with WormMine?
 
** Not for data that doesn't exist (in WormMine) yet; more agile: could be up and running within a matter of days
 
 
=== Interconnections between WormBase and FlyBase ===
 
* We could create more inter-connectivity between the two databases
 
* Sharing concise descriptions of genes
 
* Would be good for FlyBase and WB curators (Xiaodong?) to talk about where the links should exist at each site
 
  
 
== August 6, 2015 ==
 

Revision as of 17:10, 13 August 2015

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings


2015 Meetings

January

February

March

April

May

June

July



August 6, 2015

WormMine

WormMart machine

  • Wen wants to use the machine when WormMart retires

UniProt/WormBase gene class

  • need to talk to the UniProt C. elegans curator

Raymond, Chris and Juancarlos are working on phenotype viewer

James: given a list of genes, determine which tissues they are enriched in

  • python code
  • biotype ontology, tissue expression from postgres as input
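
One way such a tissue enrichment could be computed is sketched below. This is not James's code: the one-sided hypergeometric test, the function names (`hypergeom_sf`, `tissue_enrichment`) and the plain-dictionary input are all assumptions made for illustration, and loading the tissue expression data from postgres is not shown.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N genes total, K in tissue, n drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    # comb() returns 0 when the lower argument exceeds the upper one,
    # so impossible terms drop out of the sum on their own.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def tissue_enrichment(gene_list, tissue_to_genes, background):
    """Rank tissues by how over-represented they are in gene_list.

    gene_list       -- genes of interest (hypothetical input)
    tissue_to_genes -- dict: tissue name -> genes expressed there (hypothetical)
    background      -- set of all genes that could have been picked
    """
    genes = set(gene_list) & background
    results = []
    for tissue, members in tissue_to_genes.items():
        members = set(members) & background
        hits = len(genes & members)
        if hits == 0:
            continue  # nothing to test for this tissue
        p = hypergeom_sf(hits, len(background), len(members), len(genes))
        results.append((tissue, hits, p))
    # Smallest p-value first; no multiple-testing correction is applied here.
    return sorted(results, key=lambda r: r[2])


if __name__ == "__main__":
    # Toy data with placeholder gene and tissue names.
    background = {f"g{i}" for i in range(10)}
    tissues = {"neuron": ["g0", "g1", "g2", "g3"],
               "muscle": ["g4", "g5", "g6", "g7", "g8", "g9"]}
    for row in tissue_enrichment(["g0", "g1", "g2"], tissues, background):
        print(row)
```

A real analysis would likely need a multiple-testing correction across tissues and some handling of the ontology hierarchy, neither of which this sketch attempts.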


August 13, 2015

Phenotype term annotation summary graph

Goal: Provide an ontology-relationship-aware summary view of a gene's phenotype annotations.

Prototype links:

  • aex-3 (fewer phenotypes): existing phenotype widget <http://www.wormbase.org/species/c_elegans/gene/WBGene00000086#-b-3>, summary graph <http://131.215.12.204/~azurebrd/cgi-bin/amigo.cgi?action=annotSummaryGraph&focusTermId=WBGene00000086>
  • daf-2 (many phenotypes): existing phenotype widget <http://www.wormbase.org/species/c_elegans/gene/WBGene00000898#-b-3>, summary graph <http://131.215.12.204/~azurebrd/cgi-bin/amigo.cgi?action=annotSummaryGraph&focusTermId=WBGene00000898>

Proposed development procedure:
  • standalone prototyping, commenting and improvements within the group.
  • implementation as a widget on dev site (juancarlos.wormbase.org), more testing and soliciting comments from selected end users.
  • committing to main site for general use.
Outline of graph processing:

To gather information:

  • WOBr query to collect all phenotypes annotated to the gene of interest.
  • WOBr query to collect all transitive relationships of the phenotypes collected in the previous step towards the ontology root.

To simplify and to control graph size:

  • Remove all nodes (phenotype terms) that are neither directly annotated nor branching points where two branches of annotations merge (the lowest common ancestor, LCA, if you will).
  • Scale node size according to annotation count (includes inferred annotations).
  • Limit appearance of label to nodes above a given size (roughly big enough to hold term name).
  • Show annotation counts in a mouse-over bubble, and add a hyperlink to the term page on each node.
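
The gathering and simplification steps above could be sketched roughly as follows. This is an illustration under assumptions, not the prototype's code: the function names (`ancestors`, `summary_graph`) and the input dictionaries are hypothetical, the WOBr queries are replaced by already-fetched data, and a "branching point" is approximated as any ancestor with two or more children in the subgraph of annotated terms and their ancestors.

```python
from collections import defaultdict


def ancestors(term, parents):
    """Return every transitive ancestor of term (term itself excluded)."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def summary_graph(direct_counts, parents):
    """Prune an ontology to annotated terms plus branching points.

    direct_counts -- dict: term -> number of direct annotations (assumed input)
    parents       -- dict: term -> list of parent terms (assumed input)
    Returns (nodes, edges): nodes maps each kept term to its annotation count
    including inferred annotations; edges are (child, nearest kept ancestor).
    """
    # Inferred counts: each annotation also counts once for every ancestor.
    inferred = defaultdict(int)
    for term, count in direct_counts.items():
        for node in {term} | ancestors(term, parents):
            inferred[node] += count

    # Child links restricted to the induced subgraph.
    children = defaultdict(set)
    for node in list(inferred):
        for parent in parents.get(node, ()):
            children[parent].add(node)

    # Keep directly annotated terms and terms where branches merge.
    keep = {n for n in inferred
            if n in direct_counts or len(children.get(n, ())) >= 2}

    # Reconnect each kept node to its nearest kept ancestors.
    edges = set()
    for node in keep:
        seen = set()
        stack = list(parents.get(node, ()))
        while stack:
            parent = stack.pop()
            if parent in seen:
                continue
            seen.add(parent)
            if parent in keep:
                edges.add((node, parent))
            else:
                stack.extend(parents.get(parent, ()))
    return {n: inferred[n] for n in keep}, edges


if __name__ == "__main__":
    # Placeholder term names; A and D are the directly annotated phenotypes.
    parents = {"A": ["B"], "B": ["C"], "D": ["C"], "C": ["E"], "E": ["root"]}
    nodes, edges = summary_graph({"A": 2, "D": 1}, parents)
    print(nodes)
    print(edges)
```

In this toy ontology the intermediate terms B, E and root are dropped, C survives as the point where the two annotated branches merge, and its count of 3 could drive the node size and the mouse-over text. Note that the root is not kept unless it is itself annotated or a branching point; whether the prototype treats the root that way is not stated in these notes.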

International Biocuration Conference

Propose to submit paper on Community Curation

  • Mary Ann happy to lead.
  • Daniela on board.