Difference between revisions of "WormBase-Caltech Weekly Calls"
From WormBaseWiki
Revision as of 18:00, 7 February 2013
January 10, 2013
Site-of-action pages
- No cell function pages exist
- Tried to display cell function info on anatomy pages
- Trying to display site-of-action data on relevant gene pages
Process pages
- Rudimentary process page up
- Has overview widget with all relevant entities
- Should have something to show by SAB
Intermine
- JD is working on; should have something ready by SAB
- Will replace WormMart entirely
- We should make an announcement on the main WormBase page that WormMart will remain at WS220 and Intermine will replace it
- We can perform queries for users in the meantime, if necessary
- Intermine interface isn't very intuitive; maybe we can improve it in time
- Can build queries with a query builder and perform complex queries
- Precanned queries could be made and proposed to users
- Curators can try YeastMine or FlyMine to try it out
Species expression data from Itai Yanai
- Importing pictures; potentially flagging each to a particular species
- Pictures typically linked to Expr_pattern objects
- Can images be linked to sequences, rather than gene objects?
Documentation for Protein-2-GO tool
Brugia database
- Mark Blaxter offered to donate his Brugia database to WormBase
- Papers from 1600s!!!
Nematode Textpresso
- ~10,000 new papers available (open access?)
Species in Phenotype curation
- Species tag in phenotype
- Default dumping with species as C. elegans; some data was removed during the dump
- Do phenotypes need a species tag? We should add species tag to everything, to be safe
- Do we create a separate Citace for each species (potentially 100s)?
- Dump all data for all species in one file for testing, and individual files for upload?
- Should discuss with Kevin Howe
Upload Stats
- Wen will process upload statistics as curators submit data
- Curators can check for discrepancies right away, rather than wait for the build and submission to notice errors
Legacy info complete!
Curation Status Form
- Will be live (on Tazendra) once some SVM reruns are done
INDI (Interesting, Not-yet-modeled Data Index)
- Wiki page generated to capture data types that do not fit cleanly into current data models
- http://wiki.wormbase.org/index.php/INDI
Community Annotation of Concise Descriptions, WikiPathways
- Concise Descriptions
- Ranjana wrote up a template; Kimberly and Ranjana were stress testing the form
- Goal: Useful to WB curators, specific community members (experts) for trial curation, and finally general users
- Scripting simple descriptions, manual annotation of complex descriptions
- Prioritizing genes that have no descriptions
- Form should allow a user comment box to indicate data that is missing from WormBase (must have reference or data to support)
- WikiPathways
- How do we engage the community in WikiPathways? Prepopulating pages with lists of genes?
- WormBase-approved models vs. community suggestions/ideas
- We could focus on specific data types (for this year, say) and really push a public agenda to get community annotation rolling
SAB Meeting
- Few weeks away
- What do we present?
- We want to discuss our process: curation pipeline, Curation Status Tool
- Human disease relevance
- Generate a consistent theme or topic flow; divvy up topics among people to present
- How do we engage the community? What tools do we develop? How will they work? Educational outreach?
- Display
- How do we display transcription regulatory networks (TRNs)?
- How do we display/capture pathways?
- Topics: Multiple genomes, Natural variants, Transcriptomics, TRNs, Community Annotation, Pathways, Human relevance, Drugs/molecules
January 17, 2013
Curation Status Form
- Expression data
- Daniela going through and validating negative papers
- No record of truly negative papers during Andrei's first pass
- ~2000 papers flagged positive for other_expr but not curated
- Papers up to year 2000 were carefully, manually scored for expression data
- We can present the form at SAB meeting
- Transgene, antibody, and disease are coming from Textpresso, not SVM; papers will be flagged through Curator First Pass tables
- Microarray datatype to be added to form
- GEO data with no reference? Ignore
- Need to add a "Positive but not curatable" pre-canned comment
SAB Meeting agenda
- Chris will present flagging and curation pipeline overview
- Human relevance
- Diseases, disease ontology
- "Spontaneous" updates to DB/website? Mirrors?
- Processes and pathways
- Transcription Regulatory Networks
- Show Cytoscape view of regulatory interactions
- Show GBrowse with PWMs, for example
- Cytoscape cell lineage?
- BioTapestry
- Cytoscape network filters? (temporal, spatial, process)
- OICR and Hinxton will present first
- Kimberly can present on LEGO
- GO curation pipeline
- Future plans
- Community Annotation
- Meeting at Millikan boardroom
- SAB Monday night dinner @ Morgan Library
- Sunday meeting
- Discuss Cytoscape options with OICR team
Gene Disease OA
- Ranjana and Juancarlos developed OA for gene/disease relationships
- Connections to Reactome via human genes
Cori Bargmann
- Giving talk this Tuesday 4pm
- She'll meet WormBase Wednesday morning ~9am-10:30am
Prioritization of Allele/Phenotype curation
- Lack of data
- Process-based curation
- Top priority: phenotypes for genes that have no existing data
- Consider throughput of data sets (large scale may be less reliable/granular)
Community Concise Descriptions
- Focus on C. briggsae and Brugia malayi
Dead genes
- Plan is to maintain record of original genes (dead or not) referred to by objects in Postgres (and other curation databases)
- At dump, genes will be recognized as dead and replaced where applicable
- ACEDB and website will display only updated genes but have a "Historical_gene" tag and a remark indicating original reference to a dead gene
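The dead-gene plan above can be sketched as a small dump-time step. This is only an illustration under stated assumptions: the function name, the dictionary layout, the `historical_gene` and `remark` keys, and the gene IDs are all hypothetical, and the actual Postgres tables and dump scripts are not described in these notes.

```python
def resolve_genes_for_dump(annotations, replacements):
    """Return copies of annotations with dead genes swapped for live ones.

    annotations  -- list of dicts, each with a "gene" key (hypothetical layout)
    replacements -- dict mapping a dead gene ID to its replacement ID,
                    or to None when the dead gene has no replacement
    """
    resolved = []
    for ann in annotations:
        original = ann["gene"]
        current = original
        seen = set()
        # Follow chains of replacements (A -> B -> C); stop on cycles.
        while replacements.get(current) is not None and current not in seen:
            seen.add(current)
            current = replacements[current]
        out = dict(ann)  # the stored record keeps the original gene untouched
        if current != original:
            out["gene"] = current
            # Mirrors the "Historical_gene" tag plus remark idea from the notes.
            out["historical_gene"] = original
            out["remark"] = f"Originally referred to dead gene {original}"
        resolved.append(out)
    return resolved


# Made-up IDs, for illustration only.
replacements = {"GENE_OLD": "GENE_MID", "GENE_MID": "GENE_NEW", "GENE_GONE": None}
annotations = [
    {"gene": "GENE_OLD", "phenotype": "example"},
    {"gene": "GENE_LIVE", "phenotype": "example"},
    {"gene": "GENE_GONE", "phenotype": "example"},
]
dumped = resolve_genes_for_dump(annotations, replacements)
```

In this sketch a dead gene with no replacement is dumped unchanged; whether such records should instead be dropped or flagged is a curation decision the notes leave open.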
January 24, 2013
SAB Meeting
- Curation pipeline
- Flagging methods
- SVM
- Author First Pass
- Curator First Pass
- Textpresso
- Curation progress over time
- Where are we? How much do we have to do? What's our rate?
- Do we reprioritize curation? Deprecate some curation?
- Group data types by phenotype related classes
- Compare current numbers to WS220 and WS200
- Numbers by object # versus paper #
- Large-scale vs small-scale data sets
- High vs low numbers
- Flagging methods
- Transcription Regulatory Networks and Gene Expression
- modENCODE data in GBrowse (ChIP-Seq, RNA-Seq)
- Cytoscape view - can filter for regulatory interactions
- FlyBase TRN data
- BioTapestry
- What questions can we ask? What do we want to ask?
- How much do we need to develop tools? Can we co-opt what has already been developed?
- Transgenomics
- Interactions
- New model, consolidating interaction types
- New genetic interaction ontology with BioGrid, SGD, etc.
- New OA
- Will get data from and share with BioGrid
- Genetic interactions done in parallel with phenotype curation
- Regulatory interactions done by Xiaodong
- Large-scale predicted interaction datasets
- Focus on processes and pathways
- Process pages and WikiPathways
- LEGO vs WikiPathways: Redundant? How are they complementary?
- Community Annotation
- Curation Priority & Efficiency
- Curation by paper, data type, topic/process?
- Should we pilot different curation pipelines?
- What data types are more easily curated in parallel (e.g. phenotype and genetic interactions)?
- Hybrid approach: spend some time acquiring specific (and related) data types, then switch to some other curation focus
- Week focused on particular topic; curator jamboree
- Experiment with different approaches: What's most fun? Efficient? Productive? Satisfying?
January 31, 2013
SAB Review
- Blog
- Use blog more; user advice like: how to open all a page's widgets
- Coming soon info: new data, new displays
- Maybe just focus on existing content for now
- We now have XXX number of XXX objects (Expr pattern, RNAi, phenotypes, etc.)
- Distributing tasks/efforts
- WormBase YouTube channel
- Recycling notices of features: Make sure all features that we make notice of
- Report broken things and when things are fixed
- Need a concise, comprehensive Table of Contents for the site; Site Map to help users know about data and features
- How is WBPerson class growing/changing?
- Bug Fixes
- Prioritizing bugs to be fixed
- Everyone checks the staging site for a full day, once per release or month
- Don't show features that are broken on site
- Intermine priority
- What do we need to do to get it running ASAP?
- What is the minimal function required for going live?
- Start with gene class, see how much we can get ready
- June 2013 release?
- Work with existing capabilities of Intermine, rather than holding out for new query features/capabilities
- Bottom-up vs top-down approach to development?
- What are the biological priorities for Intermine?
- Integrate one class at a time, starting with most important data
- Get a freeze ready by April 2013, then start testing cycles and make tutorials
- Incremental updates
- Easiest thing is if we (Caltech) make pages here and send them off to OICR to place in a widget
- Curators should decide what type of data they would like to see in incremental updates and discuss with Juancarlos
- Concise description updates; how to handle typos?
- Minor vs Major changes? Let's not distinguish right now
- New data widget open by default on pages?
- Release notes
- Need to send Hinxton a list of items (numbers) to be reported on release notes
- Chris will compile a list of putative release stats to report and send around
- Resequencing strain issues
- Curating phenotypes to strains vs. alleles, because of potential background mutations
- Linking phenotypes to strain, but VIA a presumed allele
- Pathways - who is an expert? Can they help?
- Gene pre-populating in WikiPathways
- Transcription Regulatory Networks
- Can we have FlyBase-like GBrowse tracks?
- When do we say a Transcription Factor is regulating a target gene? Corroborating evidence
- Paper Annotation Tool
- Make tool available at the PubMed site?
- Expression page revamp
- RNA-seq graphs, microarray graphs
- Displaying genotype of transgenes on expression page
- Displaying all anatomy terms associated with all Expr patterns
- Yanai data as graphs/images
- Expression clusters: re-annotating now, working on display and querying
- Brief IDs vs. concise descriptions
Table of Contents for Future Discussion
- Strain-to-Phenotype curation
- Intermine
- Incremental updates
- Bug fix prioritization
- Pathways
- Brief IDs vs. concise descriptions
- LEGO
- Community Annotation
- Phenotype-based curation; jamboree-style curation; Flagging papers by topic/process
- Phenotype-to-GO issues
- Quarterly update reports
- Relational DB vs ACEDB
- Hiring more developers?
February 7, 2013
Person Cytoscape
- Person-person network viewing on website
- Use person lineages
- Will discuss with web team
WormBook History
- Paul editing a person history section, starting with John White
- Can use history info to link people
Binary vs. Non-binary interaction display
- Cytoscape and interaction table are only displaying interactions as binary interactions, regardless of number of interactors
- We need to assess how many interactions we have that have more than 2 interactors
- Once we have assessed that, we should discuss with web team about how to display these cases
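The assessment step above (how many interactions have more than 2 interactors) could be approached with a short script along these lines. It is a rough sketch only: the input layout (interaction ID mapped to a list of interactor names) and all IDs are made up for illustration, and real data would need to be exported from the curation database first.

```python
from collections import Counter


def interactor_histogram(interactions):
    """Count interactions by number of distinct interactors."""
    return Counter(len(set(members)) for members in interactions.values())


def non_binary_interactions(interactions):
    """Return only the interactions with more than 2 distinct interactors."""
    return {
        interaction_id: members
        for interaction_id, members in interactions.items()
        if len(set(members)) > 2
    }


# Hypothetical example data: interaction ID -> interactor names.
interactions = {
    "INT_1": ["gene-a", "gene-b"],
    "INT_2": ["gene-a", "gene-b", "gene-c"],
    "INT_3": ["gene-d", "gene-e", "gene-f", "gene-g"],
    "INT_4": ["gene-a", "gene-a"],  # self-interaction, one distinct interactor
}

histogram = interactor_histogram(interactions)
multi = non_binary_interactions(interactions)
print(f"{len(multi)} of {len(interactions)} interactions have more than 2 interactors")
```

Counting distinct interactors means a self-interaction listed twice is treated as a single interactor; if duplicates carry meaning in the data model, that choice would need revisiting.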
Curation Statistics
- Wen can generate numbers for Citace AND the complete build (rather than Hinxton)
- End-of-build summary for curators (with more database technical numbers) separate from the Release Notes for public (with more biologically interesting numbers)
- Will separate Wiki page into two sections to reflect this
Postgres info table (on Wiki page)
- Curators will continue to add info to Wiki on scripts, dependencies, ontologies, etc. for each Ontology Annotator