WormBase-Caltech Weekly Calls January 2013


January 10, 2013

Site-of-action pages

  • No cell function pages exist
  • Tried to display cell function info on anatomy pages
  • Trying to display site-of-action data on relevant gene pages

Process pages

  • Rudimentary process page up
  • Has overview widget with all relevant entities
  • Should have something to show by SAB

Intermine

  • JD is working on it; should have something ready by SAB
  • Will replace WormMart entirely
  • We should make an announcement on the main WormBase page that WormMart will remain at WS220 and Intermine will replace it
  • We can perform queries for users in the meantime, if necessary
  • Intermine interface isn't very intuitive; maybe we can improve it in time
  • Can build queries with a query builder and perform complex queries
  • Precanned queries could be made and proposed to users
  • Curators can try YeastMine or FlyMine to try it out

Species expression data from Itai Yanai

  • Importing pictures; potentially flagging each to a particular species
  • Pictures typically linked to Expr_pattern objects
  • Can images be linked to sequences, rather than gene objects?

Documentation for Protein-2-GO tool

Brugia database

  • Mark Blaxter offered to donate his Brugia database into WormBase
  • Papers from 1600s!!!

Nematode Textpresso

  • ~10,000 new papers available (open access?)

Species in Phenotype curation

  • Species tag in phenotype
  • Default dumping with species as C. elegans; some data was removed during the dump
  • Do phenotypes need a species tag? We should add species tag to everything, to be safe
  • Do we create a separate Citace for each species (potentially 100s)?
  • Dump all data for all species in one file for testing, and individual files for upload?
  • Should discuss with Kevin Howe
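
One way to picture the species-tag proposal above is the minimal sketch below. It assumes annotation records are plain dictionaries with an optional "species" key; the record layout and the function names are hypothetical and do not reflect the actual dump scripts. Every record gets an explicit species (defaulting to C. elegans), and the same tagged records can then be written to one combined file for testing or grouped per species for upload.

```python
from collections import defaultdict

# Assumption: untagged records are treated as C. elegans, as in the current default dump.
DEFAULT_SPECIES = "Caenorhabditis elegans"


def tag_species(records):
    """Return copies of the records with an explicit species on every one."""
    return [{**r, "species": r.get("species") or DEFAULT_SPECIES} for r in records]


def split_by_species(records):
    """Group tagged records by species, e.g. to write one upload file per species."""
    groups = defaultdict(list)
    for record in tag_species(records):
        groups[record["species"]].append(record)
    return dict(groups)


# Hypothetical example records; the phenotype names are placeholders.
records = [
    {"phenotype": "phenotype_A"},
    {"phenotype": "phenotype_B", "species": "Caenorhabditis briggsae"},
]
combined = tag_species(records)        # one list for the combined test dump
per_species = split_by_species(records)  # one list per species for upload
```

Because no record is dropped for lacking a tag, this avoids the data loss seen in the default dump, though whether a separate Citace per species is needed remains an open question for the discussion with Kevin Howe.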

Upload Stats

  • Wen will process upload statistics as curators submit data
  • Curators can check for discrepancies right away, rather than wait for the build and submission to notice errors

Legacy info complete!

Curation Status Form

  • Will be live (on Tazendra) once some SVM reruns are done

INDI (Interesting, Not-yet-modeled Data Index)

Community Annotation of Concise Descriptions, WikiPathways

  • Concise Descriptions
    • Ranjana wrote up a template; Kimberly and Ranjana were stress testing the form
    • Goal: Useful to WB curators, specific community members (experts) for trial curation, and finally general users
    • Scripting simple descriptions, manual annotation of complex descriptions
    • Prioritizing genes that have no descriptions
    • Form should allow a user comment box to indicate data that is missing from WormBase (must have reference or data to support)
  • WikiPathways
    • How do we engage the community in WikiPathways? Prepopulating pages with lists of genes?
    • WormBase-approved models vs. community suggestions/ideas
  • We could focus on specific data types (for this year, say) and really push a public agenda to get community annotation rolling

SAB Meeting

  • Few weeks away
  • What do we present?
  • We want to discuss our process: curation pipeline, Curation Status Tool
  • Human disease relevance
  • Generate consistent theme or topic flow; divvy up among people to present
  • How do we engage the community? What tools do we develop? How will they work? Educational outreach?
  • Display
    • How do we display transcription regulatory networks (TRNs)?
    • How do we display/capture pathways?
  • Topics: Multiple genomes, Natural variants, Transcriptomics, TRNs, Community Annotation, Pathways, Human relevance, Drugs/molecules

January 17, 2013

Curation Status Form

  • Expression data
    • Daniela going through and validating negative papers
    • No record of truly negative papers during Andrei's first pass
    • ~2000 papers flagged positive for other_expr but not curated
    • Papers up to year 2000 were carefully, manually scored for expression data
  • We can present the form at SAB meeting
  • Transgene, antibody, and disease are coming from Textpresso, not SVM; papers will be flagged through Curator First Pass tables
  • Microarray datatype to be added to form
    • GEO data with no reference? Ignore
  • Need to add a "Positive but not curatable" pre-canned comment

SAB Meeting agenda

  • Chris will present flagging and curation pipeline overview
  • Human relevance
    • Diseases, disease ontology
  • "Spontaneous" updates to DB/website? Mirrors?
  • Processes and pathways
  • Transcription Regulatory Networks
    • Show Cytoscape view of regulatory interactions
    • Show GBrowse with PWMs, for example
    • Cytoscape cell lineage?
    • BioTapestry
    • Cytoscape network filters? (temporal, spatial, process)
  • OICR and Hinxton will present first
  • Kimberly can present on LEGO
  • GO curation pipeline
  • Future plans
  • Community Annotation
  • Meeting at Millikan boardroom
  • SAB Monday night dinner @ Morgan Library
  • Sunday meeting
    • Discuss Cytoscape options with OICR team

Gene Disease OA

  • Ranjana and Juancarlos developed OA for gene/disease relationships
  • Connections to Reactome via human genes

Cori Bargmann

  • Giving talk this Tuesday 4pm
  • She'll meet WormBase Wednesday morning ~9am-10:30am

Prioritization of Allele/Phenotype curation

  • Lack of data
  • Process-based curation
  • Top priority: phenotypes for genes that have no existing data
  • Consider throughput of data sets (large scale may be less reliable/granular)

Community Concise Descriptions

  • Focus on C. briggsae and Brugia malayi

Dead genes

  • Plan is to maintain record of original genes (dead or not) referred to by objects in Postgres (and other curation databases)
  • At dump, genes will be recognized as dead and replaced where applicable
  • ACEDB and website will display only updated genes but have a "Historical_gene" tag and a remark indicating original reference to a dead gene
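
The dump-time behavior described above could look roughly like the following sketch. It is illustrative only: the replacement map, the record class, the gene IDs, and the choice to skip dead genes that have no replacement are all assumptions, not the actual Postgres dump code. The original gene stays in the curation database; only the dumped record is rewritten, carrying the old ID under a Historical_gene field plus a remark.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DumpedAnnotation:
    gene: str                              # live gene shown in ACEDB / on the website
    historical_gene: Optional[str] = None  # original dead gene, if one was replaced
    remark: Optional[str] = None


def resolve_gene(gene_id: str, replaced_by: Dict[str, Optional[str]]) -> Optional[str]:
    """Follow dead-gene replacements until a live gene is reached.

    `replaced_by` (hypothetical) maps a dead gene ID to its replacement,
    or to None when the gene is dead without a successor.
    """
    seen = set()
    current = gene_id
    while current in replaced_by:
        if current in seen:
            raise ValueError(f"Replacement cycle involving {current}")
        seen.add(current)
        current = replaced_by[current]
        if current is None:
            return None
    return current


def dump_annotation(gene_id: str, replaced_by: Dict[str, Optional[str]]) -> Optional[DumpedAnnotation]:
    """Build the dumped record for an annotation curated against `gene_id`."""
    live = resolve_gene(gene_id, replaced_by)
    if live is None:
        # Assumption: no replacement is applicable, so the record is left out of the dump.
        return None
    if live == gene_id:
        return DumpedAnnotation(gene=gene_id)
    return DumpedAnnotation(
        gene=live,
        historical_gene=gene_id,
        remark=f"Originally curated against dead gene {gene_id}",
    )


# Made-up gene IDs for illustration only.
replaced_by = {"WBGene00000002": "WBGene00000003", "WBGene00000004": None}
merged = dump_annotation("WBGene00000002", replaced_by)
```

Here `merged.gene` is the replacement ID and `merged.historical_gene` keeps the original reference, so the curated record in Postgres never has to change.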

January 24, 2013

SAB Meeting

  • Curation pipeline
    • Flagging methods
      • SVM
      • Author First Pass
      • Curator First Pass
      • Textpresso
    • Curation progress over time
      • Where are we? How much do we have to do? What's our rate?
      • Do we reprioritize curation? Deprecate some curation?
      • Group data types by phenotype related classes
      • Compare current numbers to WS220 and WS200
      • Numbers by object # versus paper #
      • Large-scale vs small-scale data sets
      • High vs low numbers
  • Transcription Regulatory Networks and Gene Expression
    • modENCODE data in GBrowse (ChIP-Seq, RNA-Seq)
    • Cytoscape view - can filter for regulatory interactions
    • FlyBase TRN data
    • BioTapestry
    • What questions can we ask? What do we want to ask?
    • How much do we need to develop tools? Can we co-opt what has already been developed?
    • Transgenomics
  • Interactions
    • New model, consolidating interaction types
    • New genetic interaction ontology with BioGrid, SGD, etc.
    • New OA
    • Will get data from and share with BioGrid
    • Genetic interactions done in parallel with phenotype curation
    • Regulatory interactions done by Xiaodong
    • Large-scale predicted interaction datasets

  • Focus on processes and pathways
    • Process pages and WikiPathways
    • LEGO vs WikiPathways: Redundant? How are they complementary?

  • Community Annotation
  • Curation Priority & Efficiency
    • Curation by paper, data type, topic/process?
    • Should we pilot different curation pipelines?
    • What data types are more easily curated in parallel (e.g. phenotype and genetic interactions)?
    • Hybrid approach: spend some time acquiring specific (and related) data types, then switch to some other curation focus
    • Week focused on particular topic; curator jamboree
    • Experiment with different approaches: What's most fun? Efficient? Productive? Satisfying?

January 31, 2013

SAB Review

  • Blog
    • Use blog more; user advice like: how to open all of a page's widgets
    • Coming soon info: new data, new displays
      • Maybe just focus on existing content for now
    • We now have XXX number of XXX objects (Expr pattern, RNAi, phenotypes, etc.)
    • Distributing tasks/efforts
    • WormBase YouTube Channel
    • Recycling notices of features: Make sure all features that we make notice of
    • Report broken things and when things are fixed
    • Need a concise, comprehensive Table of Contents for the site; Site Map to help users know about data and features
    • How is WBPerson class growing/changing?
  • Bug Fixes
    • Prioritizing bugs to be fixed
    • Everyone checks the staging site for a full day, once per release or month
    • Don't show features that are broken on site
  • Intermine priority
    • What do we need to do to get it running ASAP?
    • What is the minimal function required for going live?
    • Start with gene class, see how much we can get ready
    • June 2013 release?
    • Work with existing capabilities of Intermine, not holding out on new query features/capabilities
    • Bottom-up vs top-down approach to development?
    • What are the biological priorities for Intermine?
    • Integrate one class at a time, starting with most important data
    • Get a freeze ready by April 2013, then start testing cycles and make tutorials
  • Incremental updates
    • Easiest thing if we (Caltech) make pages here and send off to OICR to place in widget
    • Curators should decide what type of data they would like to see in incremental updates and discuss with Juancarlos
    • Concise description updates; how to handle typos?
    • Minor vs Major changes? Let's not distinguish right now
    • New data widget open by default on pages?
  • Release notes
    • Need to send Hinxton a list of items (numbers) to be reported on release notes
    • Chris will compile a list of putative release stats to report and send around
  • Resequencing strain issues
    • Curating phenotypes to strains vs. alleles, because of potential background mutations
    • Linking phenotypes to strain, but VIA a presumed allele
  • Pathways - who is an expert? Can they help?
    • Gene pre-populating in WikiPathways
  • Transcription Regulatory Networks
    • Can we have FlyBase-like GBrowse tracks?
    • When do we say a Transcription Factor is regulating a target gene? Corroborating evidence
  • Paper Annotation Tool
    • Make tool available at the PubMed site?
  • Expression page revamp
    • RNA-seq graphs, microarray graphs
    • Displaying genotype of transgenes on expression page
    • Displaying all anatomy terms associated with all Expr patterns
    • Yanai data as graphs/images
    • Expression clusters: re-annotating now, working on display and querying
  • Brief IDs vs. concise descriptions

Table of Contents for Future Discussion

  • Strain-to-Phenotype curation
  • Intermine
  • Incremental updates
  • Bug fix prioritization
  • Pathways
  • Brief IDs vs. concise descriptions
  • LEGO
  • Community Annotation
  • Phenotype-based curation; jamboree-style curation; Flagging papers by topic/process
  • Phenotype-to-GO issues
  • Quarterly update reports
  • Relational DB vs ACEDB
  • Hiring more developers?