Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki

Revision as of 18:42, 27 July 2015

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings


2015 Meetings

January

February

March

April

May

June


July 2, 2015

Discussion Topics from IWM

  • Explaining job posting options via the forum in new Worm Breeder's Gazette article
  • Display of CRISPR data
    • Alleles with multiple lesions (one name, many mutations), need to be curated and mapped
  • Ontology term enrichment analysis, using ontologies other than gene ontology
    • Discussed on GO call yesterday; we can/should follow up with Paul Thomas
    • Would be good to have a single central tool/resource for enrichment analysis
    • PantherDB vs DAVID
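
As a rough illustration of what an ontology term enrichment test computes, the sketch below uses a one-sided hypergeometric test, a common choice for this kind of analysis. It is not taken from PantherDB or DAVID; the function name, its parameters and the example numbers are all hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes annotated to the ontology term
    sample:     number of genes in the query list
    hits:       query genes annotated to the term
    (All names are illustrative assumptions, not from any WormBase tool.)
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of seeing `hits` or more annotated genes by chance.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 background genes, 5 annotated to the term,
# a query list of 5 genes, all 5 of which carry the annotation.
p = enrichment_p_value(population=10, annotated=5, sample=5, hits=5)
```

The same function works for any ontology (anatomy, phenotype, life stage) because it only needs annotation counts, which is part of why a single central resource could serve several ontologies.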

WormBook Chapters

  • Paul S will review over next week and will provide feedback

Outreach

  • Sending out e-mails to all labs/PIs reminding about new data forms
  • Could also do more personalized outreach to a smaller subset of PIs/labs
    • Could focus on PIs not at the IWM

Anatomy

  • Embryonic development, cell division timing
    • Sulston timing
    • Waterston datasets
    • Zhao cell lineage timing datasets
    • Bao lab?
  • New EM reconstructions from David Hall, Scott Emmons, etc.
  • Neuronal connectivity, collaborative database with Scott Emmons and colleagues

Citace upload

  • Curators submit data to Wen on Tuesday, July 28th

Taking over Gene Orienteer

  • Xiaodong and Sibyl working on

RNASeq data

  • Gary Williams only using high quality data, taking care of all curation (including meta data)
  • Public archive of rejected datasets?

WormGuides

  • Bill Mohler et al working on desktop application


July 9, 2015

Expression Pattern

  • Certain/uncertain qualifiers not annotated before some date
  • ~3,000 ?Expr_pattern objects without that annotation/tag
  • Daniela will work on bringing these up to date; hopefully it won't take long

Expression Clusters to Anatomy & Life Stage annotations

  • Many large scale datasets with tissue-specific expression data
  • Much of what is in SPELL is not annotated to ?Anatomy or ?Life_stage terms
  • A goal: make expression data queryable via ?Anatomy terms/pages
  • Wen will make the model change proposal
  • We may not want to show explicitly in widget
  • There is a need for a condensed display of expression data (per gene)
  • Some datasets, like the EPIC data, explicitly mention each embryonic cell name
  • Need for a condensed ontology browser per gene/anatomy and gene/life stage

Proteomic analysis

  • Encyclopedia of Proteomic Dynamics, contacted Wen to share data
  • Wen will meet/discuss with group soon to determine what the goals are
  • It isn't clear what format the data are in
  • Should include Gary Williams on discussions as he already processes Mass Spectrometry data

External Databases

  • To what extent can we take care of the data and display of other lab/publication databases
  • Many authors want to share and make links to their database/website via WormBase
  • What is the best way to handle large scale dataset sharing requests that don't necessarily (for the time being) fit our data model
  • We can take advantage of the "External Links" display on WBPaper pages to link out to the external databases affiliated with the paper, perhaps including a link to our FTP site with shared data files
  • At least a stop gap measure until we can properly model the data

Cis-regulatory site nomenclature?

  • Barbara Meyer's lab published many "rex" (Recruitment Elements on X) sites, numbered sequentially
  • Tim Schedl wondering about others' thoughts/opinions on how to, possibly, standardize the names of cis-regulatory elements
  • Could be like gene names, without dash, e.g. "rex1", "rex2"
  • We may want to try "WBsf-" prefix, on all element names like "WBsf-rex1", although may be only used in-house

Phenotypes

  • Were there any conclusions about phenotype lookup from the Allele-Phenotype form?
  • Chris spoke with Harald Hutter and others at the meeting about how to improve the lookup for phenotypes
  • Would be good to provide an explicit option to see phenotypes of related (or allele-affiliated) genes, perhaps by shared GO-term annotation
  • Need to think more on how to best compress display of phenotypes on gene pages as well
  • We do already provide links to the Variation and Gene pages (with Phenotypes displayed) in the term information box of the form


July 16, 2015

Anatomy term page expression

  • Raymond and Juancarlos are working on a display of genes that may be exclusively expressed in that anatomy object

Construct/Transgene curation

  • Karen trying to make the curation of constructs & transgenes easier
  • May consider merging the transgene and construct OAs
  • Possibly add a construct/transgene request functionality in other OAs
    • Would those need multiple input fields?
    • Karen would take care of the details

Molecule model

  • Exogenous/endogenous tags issue
  • Scraping data from external chemical databases versus adding biologically relevant data from papers
  • We pull data from, e.g., ChEBI, but not all molecules fall under their purview, e.g. proteins

Micropublication

  • Promotion and outreach
  • Micropublications discoverable in PubMed?
  • Publisher = WormBase? Caltech?
  • Minimal standards for publication?


July 23, 2015

Worm model for autism

  • Would want to take human variations implicated in autism, look for orthologous genes in C. elegans/nematodes, and find/make the corresponding (equivalent) mutations
  • Prioritize based on worm phenotypes
  • Generally applies to human disease variants

Database Migration

  • Thomas Down leaving WormBase in September
  • Moving ahead with Datomic
  • Good starting use-case for Datomic is querying a Datomic version of GeneACE
  • Need to make sure documentation for migration to Datomic is available and comprehensible
  • Point-people at each site: Sibyl @ OICR, Juancarlos @ Caltech
  • Now need to work out the mechanics of curating into Datomic

WormBase ParaSite

  • Reciprocal searches (WB <-> PS) are working well

Microarray datasets & modSeek

  • Some earlier datasets were re-processed (log-transformed, or re-annotated into original replicates instead of averaged results)
  • Need to try out different methods of processing raw-data (WB usually only takes in processed data)
  • One pipeline can feed data into SPELL and modSeek
  • It's difficult to establish/determine gold standards for assessing process performance
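
A minimal sketch of the re-processing step described above: log-transforming values while keeping the original replicates separate rather than averaging them. The function name, the gene labels, the values and the pseudocount of 1 are all assumptions made for illustration and do not describe the actual SPELL/modSeek pipeline.

```python
import math


def log2_replicates(dataset, pseudocount=1.0):
    """Log2-transform every replicate value, keeping replicates separate.

    dataset maps a gene label to its list of per-replicate values.
    The pseudocount (an assumption here) avoids log2(0) on zero values.
    """
    return {
        gene: [math.log2(value + pseudocount) for value in replicates]
        for gene, replicates in dataset.items()
    }


# Hypothetical raw values for two genes, two replicates each.
raw = {"gene-a": [3.0, 7.0], "gene-b": [15.0, 31.0]}
logged = log2_replicates(raw)
```

Keeping one value per replicate, instead of a single averaged value per gene, preserves the variance information that downstream correlation-based tools can use.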

WormBook chapter reviewers

  • Send reviewer suggestions to Paul ASAP

C. elegans proteome in UniProt

  • Not a complete correspondence between WormBase and UniProt
  • Cases: UniProt has entry for a protein that differs by one or two amino acids from WormBase
    • Made from translations of what cDNAs etc. have been submitted
    • Partial data, e.g. partial cDNAs translated
  • Anything we can do to achieve greater consistency?
  • Protein data sets are important
  • Hinxton can use discrepancies as a flag to check on the gene/protein models
  • Would be good to have more reciprocal linkage between UniProt and WormBase
  • Example: for AVR-15, UniProt has two additional entries compared to WormBase, differing by only 1 or 2 amino acids
  • Should we pick up different entries from UniProt and store/display the data; how to reconcile?
  • Possible use case: enter a UniProt ID into the BLAST/BLAT tool to identify WormBase matches

Gene Orienteer Data

  • Sibyl and Xiaodong looking at data and scripts from Gene Orienteer

Precanned queries for exclusive expression

  • Raymond & Juancarlos working on final details
  • Intent is to display genes that may be specifically/exclusively expressed in e.g. an anatomy term
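
The intent of such a query can be sketched as a simple set comparison: a gene counts as exclusively expressed in a term when that term is its only anatomy annotation. This is a simplified assumption (it ignores ontology relationships between terms), and the function name, gene labels and term labels below are hypothetical, not the actual precanned query.

```python
def exclusively_expressed(expression, term):
    """Return genes whose only annotated anatomy term is `term`.

    expression maps a gene label to the anatomy terms it is annotated to.
    """
    return sorted(
        gene for gene, terms in expression.items() if set(terms) == {term}
    )


# Hypothetical annotations.
annotations = {
    "gene-1": ["pharynx"],
    "gene-2": ["pharynx", "intestine"],
    "gene-3": ["intestine"],
    "gene-4": ["pharynx", "pharynx"],
}
pharynx_only = exclusively_expressed(annotations, "pharynx")
```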

Embryonic developmental timing

  • Sulston, Murray timing data sets for wild type embryonic cell division timing
  • Mutant data sets are coming in as well

Genetic Interaction Ontology (GIO)

  • Latest version of the GIO complete
  • Juancarlos and Chris built a "genetic interaction calculator" to determine interaction types from quantitative phenotype inequalities
  • Sending out to other MODs, etc.
  • Seems that although there is buy in conceptually, most curators can't afford the time for such detailed curation
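
To illustrate the kind of input such a calculator works from, the sketch below turns four quantitative phenotype measurements (wild type, each single mutant, and the double mutant) into an inequality string. It does not reproduce the actual calculator and does not attempt to map the pattern to a GIO term; the function name, labels, values and the tolerance parameter are assumptions.

```python
def inequality_pattern(values, tolerance=0.0):
    """Order labelled phenotype values from highest to lowest.

    values maps a genotype label to a quantitative phenotype value.
    Adjacent values within `tolerance` of each other are joined with '='.
    """
    # sorted() is stable, so tied values keep their input order.
    ordered = sorted(values.items(), key=lambda item: -item[1])
    parts = [ordered[0][0]]
    for (_, previous), (label, current) in zip(ordered, ordered[1:]):
        parts.append("=" if abs(previous - current) <= tolerance else ">")
        parts.append(label)
    return " ".join(parts)


# Hypothetical measurements for wild type, two single mutants, and the double.
pattern = inequality_pattern({"WT": 1.0, "a": 0.5, "b": 0.5, "ab": 0.1})
```

A real implementation would need a statistically motivated notion of equality rather than a fixed tolerance, which is part of why this level of curation is time-consuming.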

Phenotype (ontology) display

  • Problems with display of phenotypes (and other annotations) on WormBase, as pointed out by several people at the IWM
  • Karen would like to start creating allele concise descriptions
  • We need compact, intelligently ordered annotation lists, not just alphabetical lists of ontology annotations
  • It would be good to show ancestors for relatedness and order
  • Chris working on Python script to display all annotations in the context of the entire ontology
  • We will need to see if this approach is feasible/beneficial
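
One way such a display could work, sketched under assumptions: collect each annotated term plus all of its ancestors, then print the result as an indented tree with annotated terms marked. This is not Chris's script; the function names, the `parents` mapping and the placeholder term labels are hypothetical, and the ontology is assumed to be acyclic.

```python
def ancestors(term, parents):
    """Return all ancestors of `term`, given a child -> parents mapping."""
    found = []
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in found:
                found.append(parent)
                stack.append(parent)
    return found


def render_tree(annotated, parents):
    """Return indented lines showing annotated terms under their ancestors."""
    keep = set(annotated)
    for term in annotated:
        keep.update(ancestors(term, parents))
    # Invert the mapping, restricted to the terms we are displaying.
    children = {}
    for term in keep:
        for parent in parents.get(term, []):
            children.setdefault(parent, []).append(term)
    roots = sorted(term for term in keep if not parents.get(term))
    lines = []

    def walk(term, depth):
        marker = " *" if term in annotated else ""
        lines.append("  " * depth + term + marker)
        for child in sorted(children.get(term, [])):
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines


# Hypothetical ontology: A is the root, B and D are its children, C is under B.
parents = {"B": ["A"], "C": ["B"], "D": ["A"]}
tree = render_tree({"C", "D"}, parents)
```

Terms marked with `*` carry a direct annotation; unmarked terms are shown only to provide context. A term with several parents appears once under each of them.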

PATO-style EQ (Entity-Quality) phenotype annotations

  • It is clear that some phenotype annotations require details, e.g. "drug sensitivity" annotations should have the drug involved
    • This drug/molecule annotation should be present in the details if not directly in the term itself
  • Raises the issue of a number of cases where we need PATO-style EQ annotations, not just explicit phenotype terms for all possible scenarios
  • This would be helpful in annotating embryonic timing and identity phenotype datasets