Difference between revisions of "WormBase-Caltech Weekly Calls"
From WormBaseWiki
Revision as of 19:01, 31 January 2013
January 10, 2013
Site-of-action pages
- No cell function pages exist
- Tried to display cell function info on anatomy pages
- Trying to display site-of-action data on relevant gene pages
Process pages
- Rudimentary process page up
- Has overview widget with all relevant entities
- Should have something to show by SAB
InterMine
- JD is working on it; should have something ready by SAB
- Will replace WormMart entirely
- We should make an announcement on the main WormBase page that WormMart will remain at WS220 and InterMine will replace it
- We can perform queries for users in the meantime, if necessary
- InterMine interface isn't very intuitive; maybe we can improve it in time
- Can build queries with a query builder and perform complex queries
- Precanned queries could be made and proposed to users
- Curators can use YeastMine or FlyMine to try it out
Species expression data from Itai Yanai
- Importing pictures; potentially flagging each to a particular species
- Pictures typically linked to Expr_pattern objects
- Can images be linked to sequences, rather than gene objects?
Documentation for Protein-2-GO tool
Brugia database
- Mark Blaxter offered to donate his Brugia database to WormBase
- Papers from 1600s!!!
Nematode Textpresso
- ~10,000 new papers available (open access?)
Species in Phenotype curation
- Species tag in phenotype
- Default dumping with species as C. elegans; some data was removed during the dump
- Do phenotypes need a species tag? We should add species tag to everything, to be safe
- Do we create a separate Citace for each species (potentially 100s)?
- Dump all data for all species in one file for testing, and individual files for upload?
- Should discuss with Kevin Howe
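One way to picture the "one file for testing, individual files for upload" idea is the sketch below. It is only an illustration under assumptions: the record layout, the `species` key, and the function names are hypothetical and not taken from the actual dump scripts. Records with no species tag fall back to C. elegans and are kept, so nothing is silently dropped.

```python
from collections import defaultdict

# Assumed fallback for records that carry no species tag (hypothetical constant).
DEFAULT_SPECIES = "Caenorhabditis elegans"


def split_by_species(records):
    """Group records by species, tagging untagged records with the default."""
    by_species = defaultdict(list)
    for rec in records:
        species = rec.get("species") or DEFAULT_SPECIES
        # Copy the record so every dumped entry carries an explicit species tag.
        by_species[species].append({**rec, "species": species})
    return dict(by_species)


def combined_dump(by_species):
    """Flatten the per-species groups into one list, e.g. for a single test file."""
    return [rec for species in sorted(by_species) for rec in by_species[species]]


# Placeholder records for illustration only (not real curation data).
example = [
    {"id": "rec1", "species": "Caenorhabditis briggsae"},
    {"id": "rec2"},  # no species tag
]
groups = split_by_species(example)
everything = combined_dump(groups)
```

Here `groups` would feed the individual upload files and `everything` the single test file; whether that split matches the build requirements is the open question for Kevin Howe.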
Upload Stats
- Wen will process upload statistics as curators submit data
- Curators can check for discrepancies right away, rather than wait for the build and submission to notice errors
Legacy info complete!
Curation Status Form
- Will be live (on Tazendra) once some SVM reruns are done
INDI (Interesting, Not-yet-modeled Data Index)
- Wiki page generated to capture data types that do not fit cleanly into current data models
- http://wiki.wormbase.org/index.php/INDI
Community Annotation of Concise Descriptions, WikiPathways
- Concise Descriptions
- Ranjana wrote up a template; Kimberly and Ranjana were stress testing the form
- Goal: Useful to WB curators, specific community members (experts) for trial curation, and finally general users
- Scripting simple descriptions, manual annotation of complex descriptions
- Prioritizing genes that have no descriptions
- Form should allow a user comment box to indicate data that is missing from WormBase (must have reference or data to support)
- WikiPathways
- How do we engage the community in WikiPathways? Prepopulating pages with lists of genes?
- WormBase-approved models vs. community suggestions/ideas
- We could focus on specific data types (for this year, say) and really push a public agenda to get community annotation rolling
SAB Meeting
- Few weeks away
- What do we present?
- We want to discuss our process: curation pipeline, Curation Status Tool
- Human disease relevance
- Generate consistent theme or topic flow; divvy up to people to present
- How do we engage the community? What tools do we develop? How will they work? Educational outreach?
- Display
- How do we display transcription regulatory networks (TRNs)?
- How do we display/capture pathways?
- Topics: Multiple genomes, Natural variants, Transcriptomics, TRNs, Community Annotation, Pathways, Human relevance, Drugs/molecules
January 17, 2013
Curation Status Form
- Expression data
- Daniela going through and validating negative papers
- No record of truly negative papers during Andrei's first pass
- ~2000 papers flagged positive for other_expr but not curated
- Papers up to year 2000 were carefully, manually scored for expression data
- We can present the form at SAB meeting
- Transgene, antibody, and disease are coming from Textpresso, not SVM; papers will be flagged through Curator First Pass tables
- Microarray datatype to be added to form
- GEO data with no reference? Ignore
- Need to add a "Positive but not curatable" pre-canned comment
SAB Meeting agenda
- Chris will present flagging and curation pipeline overview
- Human relevance
- Diseases, disease ontology
- "Spontaneous" updates to DB/website? Mirrors?
- Processes and pathways
- Transcription Regulatory Networks
- Show Cytoscape view of regulatory interactions
- Show GBrowse with PWMs, for example
- Cytoscape cell lineage?
- BioTapestry
- Cytoscape network filters? (temporal, spatial, process)
- OICR and Hinxton will present first
- Kimberly can present on LEGO
- GO curation pipeline
- Future plans
- Community Annotation
- Meeting at Millikan boardroom
- SAB Monday night dinner @ Morgan Library
- Sunday meeting
- Discuss Cytoscape options with OICR team
Gene Disease OA
- Ranjana and Juancarlos developed OA for gene/disease relationships
- Connections to Reactome via human genes
Cori Bargmann
- Giving talk this Tuesday 4pm
- She'll meet WormBase Wednesday morning ~9am-10:30am
Prioritization of Allele/Phenotype curation
- Lack of data
- Process-based curation
- Top priority: phenotypes for genes that have no existing data
- Consider throughput of data sets (large scale may be less reliable/granular)
Community Concise Descriptions
- Focus on C. briggsae and Brugia malayi
Dead genes
- Plan is to maintain record of original genes (dead or not) referred to by objects in Postgres (and other curation databases)
- At dump, genes will be recognized as dead and replaced where applicable
- AceDB and website will display only updated genes but have a "Historical_gene" tag and a remark indicating original reference to a dead gene
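The dead-gene plan above could be sketched roughly as follows. This is a hedged illustration, not the actual dump code: `GeneStatus`, `resolve_gene`, `dump_record`, the record keys, and the gene identifiers are all hypothetical; only the "Historical_gene" tag name comes from the notes. The curation record keeps the original gene, and the replacement happens only at dump time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneStatus:
    """Hypothetical status entry: whether a gene is live and what replaced it."""
    alive: bool
    replaced_by: Optional[str] = None


def resolve_gene(gene_id, status_table):
    """Follow the replacement chain; return (current_id_or_None, was_dead)."""
    seen = set()
    current = gene_id
    was_dead = False
    while True:
        status = status_table.get(current)
        if status is None or status.alive:
            # Unknown genes are passed through unchanged.
            return current, was_dead
        was_dead = True
        if status.replaced_by is None or current in seen:
            # Dead with no replacement, or a cyclic chain.
            return None, True
        seen.add(current)
        current = status.replaced_by


def dump_record(record, status_table):
    """Return a dump-ready copy; the stored curation record is left untouched."""
    original = record["gene"]
    current, was_dead = resolve_gene(original, status_table)
    out = dict(record)
    if was_dead:
        out["Historical_gene"] = original
        out["Remark"] = f"Originally referred to dead gene {original}"
        # Replace only where a live successor exists.
        out["gene"] = current if current is not None else original
    return out


# Placeholder identifiers for illustration only.
table = {
    "gene-old": GeneStatus(alive=False, replaced_by="gene-new"),
    "gene-new": GeneStatus(alive=True),
}
dumped = dump_record({"gene": "gene-old", "phenotype": "example"}, table)
unchanged = dump_record({"gene": "gene-new"}, table)
```

With this shape, `dumped` points at the live gene while still recording the original reference, and `unchanged` gets no historical tag at all.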
January 24, 2013
SAB Meeting
- Curation pipeline
- Flagging methods
- SVM
- Author First Pass
- Curator First Pass
- Textpresso
- Curation progress over time
- Where are we? How much do we have to do? What's our rate?
- Do we reprioritize curation? Deprecate some curation?
- Group data types by phenotype related classes
- Compare current numbers to WS220 and WS200
- Numbers by object # versus paper #
- Large-scale vs small-scale data sets
- High vs low numbers
- Flagging methods
- Transcription Regulatory Networks and Gene Expression
- modENCODE data in GBrowse (ChIP-Seq, RNA-Seq)
- Cytoscape view - can filter for regulatory interactions
- FlyBase TRN data
- BioTapestry
- What questions can we ask? What do we want to ask?
- How much do we need to develop tools? Can we co-opt what has already been developed?
- Transgenomics
- Interactions
- New model, consolidating interaction types
- New genetic interaction ontology with BioGRID, SGD, etc.
- New OA
- Will get data from and share with BioGRID
- Genetic interactions done in parallel with phenotype curation
- Regulatory interactions done by Xiaodong
- Large-scale predicted interaction datasets
- Focus on processes and pathways
- Process pages and WikiPathways
- LEGO vs WikiPathways: Redundant? How are they complementary?
- Community Annotation
- Curation Priority & Efficiency
- Curation by paper, data type, topic/process?
- Should we pilot different curation pipelines?
- What data types are more easily curated in parallel (e.g. phenotype and genetic interactions)?
- Hybrid approach: spend some time acquiring specific (and related) data types, then switch to some other curation focus
- Week focused on particular topic; curator jamboree
- Experiment with different approaches: What's most fun? Efficient? Productive? Satisfying?