Difference between revisions of "WormBase-Caltech Weekly Calls"

 
[[WormBase-Caltech_Weekly_Calls_June_2013|June]]
 
  
[[WormBase-Caltech_Weekly_Calls_July_2013|July]]
  
= July 11, 2013 =
 
  
 
= August 1, 2013 =
Geneace daily dump
 
*EBI is moving nameserver location
 
*Getting real-time updates of gene list for genes in OA
 
*Michael Paulini set up nightly geneace dumps to FTP site
 
*We have a gene file from the nameserver: CGC name, public name, sequence name, live/dead status, gene ID
 
*What data do we want additionally? Synonyms?
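
A minimal sketch of how a nightly gene dump with the fields listed above could be read and filtered by live/dead status. The tab-delimited layout, the column order, and the sample rows are assumptions made for illustration; the actual dump format is not described in these notes.

```python
import csv
import io

# Hypothetical column order; the real dump layout is not specified in the notes.
FIELDS = ["gene_id", "cgc_name", "public_name", "sequence_name", "status"]


def load_genes(handle):
    """Read a tab-delimited gene dump into a dict keyed by gene ID."""
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t")
    return {row["gene_id"]: row for row in reader}


def live_genes(genes):
    """Keep only records whose status is 'live' (case-insensitive)."""
    return {gid: rec for gid, rec in genes.items() if rec["status"].lower() == "live"}


# Invented placeholder rows, not real WormBase records.
sample = io.StringIO(
    "WBGene99999901\tabc-1\tabc-1\tX0001.1\tLive\n"
    "WBGene99999902\t\tX0002.2\tX0002.2\tDead\n"
)
genes = load_genes(sample)
print(sorted(live_genes(genes)))
```

Keying the records by gene ID keeps lookups simple if synonyms or other fields are added to the dump later.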
 
 
 
 
 
Spica has officially been moved to new machine
 
*Let Raymond know of any problems
 
*Would be good to track all accounts on Spica (and any other machine)
 
**Can use log of all user logins
 
 
 
 
 
AMIGO 2 still moving forward
 
*AMIGO 2 might go live July 17th
 
*We should be able to start configuring for WormBase at that point
 
 
 
 
 
Process Pages/WikiPathways
 
*iFrame window doesn't work/load on Firefox; they are working on it
 
*iFrame window interactive display somewhat problematic
 
*Discussing Cytoscape as alternative?
 
*Using Cytoscape to display pathways would require significant development
 
*An app is available to load GPML from WikiPathways into Cytoscape, but JD couldn't get it working (yet)
 
*Having all process-related interactions in an Interactions widget on Process Pages
 
**Users need a clearer legend explaining what the different edges mean
 
**We need to modify some edges (e.g. flat ends do not mean repression; maybe they should)
 
 
 
 
 
Author First Pass Forms
 
*Currently we collect data from authors that we may not intend to curate (at least right away)
 
*We can provide a disclaimer on the letter to authors explaining that some data may not be curated immediately
 
*All data is catalogued
 
 
 
 
 
Sequence Feature curation
 
*Xiaodong met with Gary Williams and Mary Ann Tuli at IWM
 
*Enhancer curation?
 
*Significant backlog on sequence feature curation
 
*Margie Ho asking about curated enhancers, regulatory regions
 
*Margie has 30 papers with highly annotated regulatory regions
 
*Gary W. is prioritizing curation of these now
 
*Gary will propose appropriate model changes (e.g. add "silencer" and "enhancer" to method for GBrowse display)
 
 
 
 
 
User use case: All G-protein coupled receptors expressed in AWC neurons
 
*Quantitative expression data is not effectively linked to anatomy terms
 
*Wen will propose model fix to accommodate this association
 
*Genes expressed above some pre-defined threshold will be associated with a cell
 
*For example, Ping (in Paul's lab) will be performing AWC single-cell profiling
 
*AWA and BAG neurons have been profiled by tiling arrays
 
*Male linker cell by RNA-Seq (Erich and Mihoko)
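
The threshold idea above (genes expressed above a pre-defined level get associated with a cell) could look roughly like the sketch below. The function name, the input layout, the threshold, and all values are hypothetical placeholders, not the model change Wen will propose.

```python
from collections import defaultdict


def genes_per_cell(measurements, threshold):
    """Map each cell to the sorted genes whose value exceeds the threshold."""
    result = defaultdict(set)
    for gene, cell, value in measurements:
        if value > threshold:
            result[cell].add(gene)
    return {cell: sorted(genes) for cell, genes in result.items()}


# Placeholder (gene, cell, expression value) tuples, not real data.
measurements = [
    ("gene-a", "AWC", 12.5),
    ("gene-b", "AWC", 0.4),
    ("gene-a", "AWA", 3.0),
    ("gene-c", "BAG", 7.1),
]
associations = genes_per_cell(measurements, threshold=5.0)
print(associations)
```

Cells with no gene above the threshold simply do not appear in the result, so the choice of threshold directly controls which gene-to-cell associations would be created.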
 
 
 
 
 
Curation strategies
 
*Change our paper-by-paper curation
 
*We may be able to make use of a Textpresso categorization program to tag papers
 
*Caltech curators can then prioritize their curation based on a particular category or topic
 
*We can look at the [http://www.biomedcentral.com/1471-2105/7/370 Textpresso paper] and reconvene next week to discuss
 
 
 
 
 
 
 
= July 18, 2013 =
 
 
 
 
 
Textpresso Paper categorization
 
*Prioritization of papers based on: 1) SVM-Textpresso script categorization, and 2) Ideal prioritization scheme according to curation status
 
*How does this tie into our grant quarterly progress report?
 
*Can we create a putative milestone to achieve for the WS240 upload?
 
*How do we consider backlog size with respect to priorities and categorization?
 
*Even if a data-type backlog is small, it would be worth going back to older curation to check for accuracy and consistency
 
*Will this pipeline be more efficient? We should define metrics to measure curation effectiveness/efficiency
 
**Compare curation statistics of new pipeline to last year or two of curation statistics
 
*Yuling can run existing SVM pipeline on corpus (supervised learning); unsupervised learning will require more human effort
 
*We can provide lists of keywords to improve the categorization
 
*There are 1750 papers with author first pass responses; Juancarlos emailed the paper IDs with the timestamp of each response
 
 
 
 
 
Upload
 
*Next upload (WS240) deadline will be last Friday in September (Sept 27th, 2013)
 
 
 
 
 
ACEDB, Citace Minus
 
*We will remove write-access from citace, moving personal files to citpub for write-access
 
*Wen will send out a summary e-mail
 
*Raymond can/will create individual user accounts (for those who want one) with access to personal versions of CitaceMinus and WS
 
**Personal versions of CitaceMinus and WS will be write-accessible
 
**Write e-mail to Raymond to request an account on Spica
 
 
 
 
 
Nightly GeneAce dumps
 
*What data from nameserver do we want to pull nightly?
 
*We need schema and existing data from Michael Paulini
 
*Until curators (Kimberly?) tell Juancarlos what we want to extract, we're keeping the scripts that get data from the nameserver.
 
 
 
 
 
= July 25, 2013 =
 
 
 
Ontology Browsers
 
* We are currently testing and developing AMIGO2 for integration into WormBase
 
** Most recent version at http://mangolassi.caltech.edu/~azurebrd/cgi-bin/testing/amigo/wobr/amigo.cgi
 
* Example Browsers:
 
** [http://www.ebi.ac.uk/ontology-lookup/ Ontology Lookup Service (OLS)] from EBI
 
** [http://www.sequenceontology.org/browser/obob.cgi MISO] (Sequence Ontology Browser from www.sequenceontology.org)
 
** OBO-Edit
 
** Protege
 
* We are trying to decide on what would be an optimal ontology browsing experience
 
* Browser experience features to consider:
 
** Directed Acyclic Graph (DAG) view
 
*** Good for visualizing "Path to Root" relationships of an ontology term
 
** Viewing children, parents, and/or siblings
 
*** Interactive expanding/collapsing of terms
 
*** Static tree or table views
 
*** "Inferred Tree View" - a compressed path to root tree view (in text format)
 
** Clicking to open a new web page versus interactive browsing without reloads
 
* Consensus is that interactive expandable and collapsible nodes would be ideal
 
* We will use the Gene Ontology as a pilot ontology to first introduce/integrate into the WormBase site; other ontologies can come later
 
* We will provide link outs to AMIGO and related sites/services to take advantage of their data and tools
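
To make the "Path to Root" idea from the DAG view concrete, here is a small sketch that collects every ancestor of a term in a directed acyclic graph. The term names and the parent map are a made-up toy ontology, and this is not how AMIGO 2 or any of the listed browsers implements it.

```python
def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Toy ontology: "leaf" has two parents, which share one root.
parents = {
    "leaf": ["mid_a", "mid_b"],
    "mid_a": ["root"],
    "mid_b": ["root"],
    "root": [],
}
print(sorted(ancestors("leaf", parents)))
```

Because a term can have more than one parent, the traversal tracks visited terms so that shared ancestors such as the root are reported only once.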
 
 
 
 
 
Sequence features related to expression patterns and gene regulation interactions
 
* Add a "Transcription factor binding" tag to Expr_pattern model? Not necessary?
 
* Already captured in gene regulation interactions?
 
* We need to discuss (site-wide) what model changes (if any) would be required to adequately capture this information
 
* Gary Williams working on curating sequence features to link appropriately to Expr_pattern, Interaction (regulatory), and (maybe) Transgenes
 
 
 
 
 
Author First Pass paper word frequency analysis
 
* Yuling performed word frequency analysis of whole papers and now sections
 
* Karen took the "Titles" analysis, filtered out words with fewer than 10 hits, and highlighted potential keywords
 
* How should we go about choosing a topic?
 
* We will choose AFP papers with "stress" in the title and assess curation status of each paper for our individual data types
 
** WBPaper00031692
 
** WBPaper00031694
 
** WBPaper00031842
 
** WBPaper00031873
 
** WBPaper00032236
 
** WBPaper00032241
 
** WBPaper00033114
 
** WBPaper00032321
 
** WBPaper00034757
 
** WBPaper00035114
 
** WBPaper00036083
 
** WBPaper00036413
 
** WBPaper00036090
 
** WBPaper00036135
 
** WBPaper00035965
 
** WBPaper00037147
 
** WBPaper00037595
 
** WBPaper00037886
 
** WBPaper00038233
 
** WBPaper00039783
 
** WBPaper00039990
 
** WBPaper00038093
 
** WBPaper00040006
 
** WBPaper00039878
 
** WBPaper00039835
 
** WBPaper00040166
 
** WBPaper00039788
 
** WBPaper00040384
 
** WBPaper00038464
 
** WBPaper00040697
 
** WBPaper00040849
 
** WBPaper00041075
 
** WBPaper00040133
 
** WBPaper00040902
 
** WBPaper00041277
 
** WBPaper00041295
 
** WBPaper00041568
 
** WBPaper00041528
 
** WBPaper00041150
 
** WBPaper00041610
 
** WBPaper00041663
 
** WBPaper00041866
 
** WBPaper00042148
 
** WBPaper00042067
 
** WBPaper00042178
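
The title word-frequency filtering described above (dropping words with fewer than 10 hits) can be sketched as follows. This is not Yuling's or Karen's actual analysis; the tokenization rule, the function name, and the example titles are assumptions for illustration.

```python
import re
from collections import Counter


def title_word_counts(titles, min_hits=10):
    """Count words across titles and keep those seen at least `min_hits` times."""
    counts = Counter()
    for title in titles:
        # Lowercase and split on anything that is not a letter.
        counts.update(re.findall(r"[a-z]+", title.lower()))
    return {word: n for word, n in counts.items() if n >= min_hits}


# Invented titles; min_hits is lowered so the tiny sample yields a result.
titles = [
    "Stress response in worms",
    "Oxidative stress and aging",
    "Aging of neurons",
]
frequent = title_word_counts(titles, min_hits=2)
print(frequent)
```

Words that survive the cutoff (here "stress" and "aging") are the candidates someone would then review by hand as potential topic keywords.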
 

Revision as of 15:37, 1 August 2013