Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
  
  
= January 10, 2013 =
2013 Meetings
  
  
Site-of-action pages
[[WormBase-Caltech_Weekly_Calls_January_2013|January]]
*No cell function pages exist
 
*Tried to display cell function info on anatomy pages
 
*Trying to display site-of-action data on relevant gene pages
 
 
 
 
 
Process pages
 
*Rudimentary process page up
 
*Has overview widget with all relevant entities
 
*Should be something by SAB
 
 
 
 
 
Intermine
 
*JD is working on; should have something ready by SAB
 
*Will replace WormMart entirely
 
*We should make an announcement on the main WormBase page that WormMart will remain at WS220 and Intermine will replace it
 
*We can perform queries for users in the meantime, if necessary
 
*Intermine interface isn't very intuitive; maybe we can improve it in time
 
*Can build queries with a query builder and perform complex queries
 
*Precanned queries could be made and proposed to users
 
*Curators can try YeastMine or FlyMine to try it out
 
 
 
 
 
Species expression data from Itai Yanai
 
*Importing pictures; potentially flagging each to a particular species
 
*Pictures typically linked to Expr_pattern objects
 
*Can images be linked to sequences, rather than gene objects?
 
 
 
 
 
Documentation for Protein-2-GO tool
 
 
 
 
 
Brugia database
 
*Mark Blaxter offered to donate his Brugia database into WormBase
 
*Papers from 1600s!!!
 
 
 
 
 
Nematode Textpresso
 
*~10,000 new papers available (open access?)
 
 
 
 
 
Species in Phenotype curation
 
*Species tag in phenotype
 
*Default dumping with species as C. elegans; some data was removed during the dump
 
*Do phenotypes need a species tag? We should add species tag to everything, to be safe
 
*Do we create a separate Citace for each species (potentially 100s)?
 
*Dump all data for all species in one file for testing, and individual files for upload?
 
*Should discuss with Kevin Howe
 
 
 
 
 
Upload Stats
 
*Wen will process upload statistics as curators submit data
 
*Curators can check for discrepancies right away, rather than wait for the build and submission to notice errors
 
 
 
 
 
Legacy info complete!
 
 
 
 
 
Curation Status Form
 
*Will be live (on Tazendra) once some SVM reruns are done
 
 
 
 
 
INDI (Interesting, Not-yet-modeled Data Index)
 
*Wiki page generated to capture data types that do not fit cleanly into current data models
 
*http://wiki.wormbase.org/index.php/INDI
 
 
 
 
 
Community Annotation of Concise Descriptions, WikiPathways
 
*Concise Descriptions
 
**Ranjana wrote up a template; Kimberly and Ranjana were stress testing the form
 
**Goal: Useful to WB curators, specific community members (experts) for trial curation, and finally general users
 
**Scripting simple descriptions, manual annotation of complex descriptions
 
**Prioritizing genes that have no descriptions
 
**Form should allow a user comment box to indicate data that is missing from WormBase (must have reference or data to support)
 
*WikiPathways
 
**How do we engage the community in WikiPathways? Prepopulating pages with lists of genes?
 
**WormBase-approved models vs. community suggestions/ideas
 
*We could focus on specific data types (for this year, say) and really push a public agenda to get community annotation rolling
 
 
 
 
 
SAB Meeting
 
*Few weeks away
 
*What do we present?
 
*We want to discuss our process: curation pipeline, Curation Status Tool
 
*Human disease relevance
 
*Generate a consistent theme or topic flow; divvy up among people to present
 
*How do we engage the community? What tools do we develop? How will they work? Educational outreach?
 
*Display
 
**How do we display transcription regulatory networks (TRNs)?
 
**How do we display/capture pathways?
 
*Topics: Multiple genomes, Natural variants, Transcriptomics, TRNs, Community Annotation, Pathways, Human relevance, Drugs/molecules
 
 
 
 
 
 
 
= January 17, 2013 =
 
 
 
 
 
Curation Status Form
 
*Expression data
 
**Daniela going through and validating negative papers
 
**No record of truly negative papers during Andrei's first pass
 
**~2000 papers flagged positive for other_expr but not curated
 
**Papers up to year 2000 were carefully, manually scored for expression data
 
*We can present the form at SAB meeting
 
*Transgene, antibody, and disease are coming from Textpresso, not SVM; papers will be flagged through Curator First Pass tables
 
*Microarray datatype to be added to form
 
**GEO data with no reference? Ignore
 
*Need to add a "Positive but not curatable" pre-canned comment
 
 
 
 
 
SAB Meeting agenda
 
*Chris will present flagging and curation pipeline overview
 
*Human relevance
 
**Diseases, disease ontology
 
*"Spontaneous" updates to DB/website? Mirrors?
 
*Processes and pathways
 
*Transcription Regulatory Networks
 
**Show Cytoscape view of regulatory interactions
 
**Show GBrowse with PWMs, for example
 
**Cytoscape cell lineage?
 
**BioTapestry
 
**Cytoscape network filters? (temporal, spatial, process)
 
*OICR and Hinxton will present first
 
*Kimberly can present on LEGO
 
*GO curation pipeline
 
*Future plans
 
*Community Annotation
 
*Meeting at Millikan boardroom
 
*SAB Monday night dinner @ Morgan Library
 
*Sunday meeting
 
**Discuss Cytoscape options with OICR team
 
 
 
 
 
Gene Disease OA
 
*Ranjana and Juancarlos developed OA for gene/disease relationships
 
*Connections to Reactome via human genes
 
 
 
 
 
Cori Bargmann
 
*Giving talk this Tuesday 4pm
 
*She'll meet WormBase Wednesday morning ~9am-10:30am
 
 
 
 
 
Prioritization of Allele/Phenotype curation
 
*Lack of data
 
*Process-based curation
 
*Top priority: phenotypes for genes that have no existing data
 
*Consider throughput of data sets (large scale may be less reliable/granular)
 
 
 
 
 
Community Concise Descriptions
 
*Focus on C. briggsae and Brugia malayi
 
 
 
 
 
Dead genes
 
*Plan is to maintain record of original genes (dead or not) referred to by objects in Postgres (and other curation databases)
 
*At dump, genes will be recognized as dead and replaced where applicable
 
*ACEDB and website will display only updated genes but have a "Historical_gene" tag and a remark indicating original reference to a dead gene
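
The dump-time replacement described above could be sketched roughly as follows. This is only an illustration, not the actual dump script: the `dead_to_live` mapping, the function names, the gene IDs, and the remark wording are all hypothetical. Only the "Historical_gene" tag and the idea of a remark come from the notes.

```python
def resolve_live_gene(gene_id, dead_to_live):
    """Follow the dead -> replacement chain until a live gene is reached.

    dead_to_live is a hypothetical mapping: dead gene ID -> replacement
    gene ID, or None when the dead gene has no replacement.
    """
    seen = set()
    current = gene_id
    while current in dead_to_live:
        if current in seen:
            raise ValueError(f"Replacement cycle detected at {current}")
        seen.add(current)
        current = dead_to_live[current]
        if current is None:
            return None  # dead gene with no replacement
    return current


def dump_gene_reference(curated_gene, dead_to_live):
    """Build the dumped fields for one gene reference.

    The curation database keeps curated_gene as originally entered;
    only the dumped record is rewritten.
    """
    if curated_gene not in dead_to_live:
        return {"Gene": curated_gene}  # gene is still live, dump unchanged

    record = {
        "Historical_gene": curated_gene,
        "Remark": f"Originally curated to dead gene {curated_gene}",
    }
    live = resolve_live_gene(curated_gene, dead_to_live)
    if live is not None:
        record["Gene"] = live  # replaced where applicable
    return record


# Made-up IDs, for illustration only.
dead_to_live = {"GeneA": "GeneB", "GeneB": "GeneC", "GeneX": None}
print(dump_gene_reference("GeneA", dead_to_live))
print(dump_gene_reference("GeneX", dead_to_live))
```

Following the whole chain (rather than one step) is a design assumption here, covering the case where a replacement gene has itself later been killed.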
 
 
 
 
 
 
 
= January 24, 2013 =
 
 
 
SAB Meeting
 
*Curation pipeline
 
**Flagging methods
 
***SVM
 
***Author First Pass
 
***Curator First Pass
 
***Textpresso
 
**Curation progress over time
 
***Where are we? How much do we have to do? What's our rate?
 
***Do we reprioritize curation? Deprecate some curation?
 
***Group data types by phenotype related classes
 
***Compare current numbers to WS220 and WS200
 
***Numbers by object # versus paper #
 
***Large-scale vs small-scale data sets
 
***High vs low numbers
 
 
 
*Transcription Regulatory Networks and Gene Expression
 
**modENCODE data in GBrowse (ChIP-Seq, RNA-Seq)
 
**Cytoscape view - can filter for regulatory interactions
 
**FlyBase TRN data
 
**BioTapestry
 
**What questions can we ask? What do we want to ask?
 
**How much do we need to develop tools? Can we co-opt what has already been developed?
 
**Transgenomics
 
 
 
*Interactions
 
**New model, consolidating interaction types
 
**New genetic interaction ontology with BioGrid, SGD, etc.
 
**New OA
 
**Will get data from and share with BioGrid
 
**Genetic interactions done in parallel with phenotype curation
 
**Regulatory interactions done by Xiaodong
 
**Large-scale predicted interaction datasets
 
 
 
 
 
*Focus on processes and pathways
 
**Process pages and WikiPathways
 
**LEGO vs WikiPathways: Redundant? How are they complementary?
 
 
 
 
 
*Community Annotation
 
 
 
*Curation Priority & Efficiency
 
**Curation by paper, data type, topic/process?
 
**Should we pilot different curation pipelines?
 
**What data types are more easily curated in parallel (e.g. phenotype and genetic interactions)?
 
**Hybrid approach: spend some time acquiring specific (and related) data types, then switch to some other curation focus
 
**Week focused on particular topic; curator jamboree
 
**Experiment with different approaches: What's most fun? Efficient? Productive? Satisfying?
 
 
 
 
 
 
 
= January 31, 2013 =
 
 
 
SAB Review
 
*Blog
 
**Use blog more; user advice like: how to open all of a page's widgets
 
**Coming soon info: new data, new displays
 
***Maybe just focus on existing content for now
 
**We now have XXX number of XXX objects (Expr pattern, RNAi, phenotypes, etc.)
 
**Distributing tasks/efforts
 
**WormBase YouTube Channel
 
**Recycling notices of features: Make sure all features that we make notice of
 
**Report broken things and when things are fixed
 
**Need a concise, comprehensive Table of Contents for the site; Site Map to help users know about data and features
 
**How is WBPerson class growing/changing?
 
*Bug Fixes
 
**Prioritizing bugs to be fixed
 
**Everyone checks the staging site for a full day, once per release or month
 
**Don't show features that are broken on site
 
*Intermine priority
 
**What do we need to do to get it running ASAP?
 
**What is the minimal function required for going live?
 
**Start with gene class, see how much we can get ready
 
**June 2013 release?
 
**Work with existing capabilities of Intermine, rather than holding out for new query features/capabilities
 
**Bottom-up vs top-down approach to development?
 
**What are the biological priorities for Intermine?
 
**Integrate one class at a time, starting with most important data
 
**Get a freeze ready by April 2013, then start testing cycles and make tutorials
 
*Incremental updates
 
**Easiest thing if we (Caltech) make pages here and send off to OICR to place in widget
 
**Curators should decide what type of data they would like to see in incremental updates and discuss with Juancarlos
 
**Concise description updates; how to handle typos?
 
**Minor vs Major changes? Let's not distinguish right now
 
**New data widget open by default on pages?
 
*Release notes
 
**Need to send Hinxton a list of items (numbers) to be reported on release notes
 
**Chris will compile a list of putative release stats to report and send around
 
*Resequencing strain issues
 
**Curating phenotypes to strains vs. alleles, because of potential background mutations
 
**Linking phenotypes to strain, but VIA a presumed allele
 
*Pathways - who is an expert? Can they help?
 
**Gene pre-populating in WikiPathways
 
*Transcription Regulatory Networks
 
**Can we have FlyBase-like GBrowse tracks?
 
**When do we say a Transcription Factor is regulating a target gene? Corroborating evidence
 
*Paper Annotation Tool
 
**Make tool available at the PubMed site?
 
*Expression page revamp
 
**RNA-seq graphs, microarray graphs
 
**Displaying genotype of transgenes on expression page
 
**Displaying all anatomy terms associated with all Expr patterns
 
**Yanai data as graphs/images
 
**Expression clusters: re-annotating now, working on display and querying
 
*Brief IDs vs. concise descriptions
 
 
 
 
 
Table of Contents for Future Discussion
 
*Strain-to-Phenotype curation
 
*Intermine
 
*Incremental updates
 
*Bug fix prioritization
 
*Pathways
 
*Brief IDs vs. concise descriptions
 
*LEGO
 
*Community Annotation
 
*Phenotype-based curation; jamboree-style curation; Flagging papers by topic/process
 
*Phenotype-to-GO issues
 
*Quarterly update reports
 
*Relational DB vs ACEDB
 
*Hiring more developers?
 
  
  

Revision as of 18:04, 7 February 2013

2009 Meetings

2011 Meetings

2012 Meetings


2013 Meetings


January


February 7, 2013

Person Cytoscape

  • Person-person network viewing on website
  • Use person lineages
  • Will discuss with web team


WormBook History

  • Paul editing a person history section, starting with John White
  • Can use history info to link people


Binary vs. Non-binary interaction display

  • Cytoscape and interaction table are only displaying interactions as binary interactions, regardless of number of interactors
  • We need to assess how many interactions we have that have more than 2 interactors
  • Once we have assessed that, we should discuss with web team about how to display these cases
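
A first pass at the assessment above might look like the sketch below. It assumes the interactions have already been exported as a mapping of interaction ID to a list of interactor IDs; that export format, the function names, and the IDs are hypothetical and not taken from the notes.

```python
from collections import Counter


def count_by_interactor_number(interactions):
    """Tally interactions by their number of distinct interactors.

    interactions: mapping of interaction ID -> iterable of interactor IDs
    (an assumed export format).
    """
    return Counter(len(set(members)) for members in interactions.values())


def non_binary_interactions(interactions):
    """Return the IDs of interactions with more than 2 distinct interactors."""
    return sorted(
        interaction_id
        for interaction_id, members in interactions.items()
        if len(set(members)) > 2
    )


# Made-up records, for illustration only.
interactions = {
    "Interaction1": ["geneA", "geneB"],
    "Interaction2": ["geneA", "geneB", "geneC"],
    "Interaction3": ["geneA", "geneA"],  # self-interaction: 1 distinct interactor
}
print(count_by_interactor_number(interactions))
print(non_binary_interactions(interactions))
```

Counting distinct interactors means a self-interaction is tallied under 1 rather than 2; whether that is the desired treatment would need to be decided along with the display question.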


Curation Statistics

  • Can Wen generate numbers for Citace AND the complete build (rather than Hinxton)?
  • Will let Hinxton build the queries
  • End-of-build summary for curators (with more database technical numbers) separate from the Release Notes for public (with more biologically interesting numbers)
  • Will separate Wiki page into two sections to reflect this


Postgres info table (on Wiki page)

  • Curators will continue to add info to Wiki on scripts, dependencies, ontologies, etc. for each Ontology Annotator