WormBase-Caltech Weekly Calls September 2014

September 4, 2014

Mary Ann: I need to leave at 5:30pm (9:30am CA time?)

Concise Descriptions

Automated descriptions to go in for WS245

New Upload Schedule

Delayed a couple of weeks compared to the original schedule
Official citace upload to Hinxton on October 10th
We can/should upload our data to Hinxton on the Wednesday before the SAB trip (October 1st)
Wen needs queries to include in Citace Upload summary by October 1st
Upload contingent on models freeze

Data submission as part of publication process

eLife considering micro-publication, addendums to papers (individual add-on experimental results)
Can certain data (sequence info, etc.) be required in order to publish?
Could there be a pilot with a specific publisher (like GSA markup)?

Elixir

http://www.elixir-europe.org/
Use RDF (Resource Description Framework) triples
Checking individual statements/sentences from literature for data presence/absence in database
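
As a rough illustration of the presence/absence check above, here is a minimal sketch that models statements as subject-predicate-object triples using plain Python tuples. It does not use Elixir's tooling or any RDF library; the `Triple` type, the `check_statements` helper, and all identifiers and predicate names are hypothetical placeholders, not real WormBase or Elixir data.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Triple(NamedTuple):
    """One statement in subject-predicate-object form (hypothetical model)."""
    subject: str
    predicate: str
    obj: str


def check_statements(
    extracted: Iterable[Triple], database: Set[Triple]
) -> Tuple[List[Triple], List[Triple]]:
    """Split literature-derived triples into those present in / absent from the database."""
    present: List[Triple] = []
    absent: List[Triple] = []
    for triple in extracted:
        (present if triple in database else absent).append(triple)
    return present, absent


# Placeholder data: identifiers and predicates are invented for illustration only.
database = {
    Triple("gene:example-1", "expressed_in", "anatomy:example-neuron"),
}
extracted = [
    Triple("gene:example-1", "expressed_in", "anatomy:example-neuron"),
    Triple("gene:example-2", "interacts_with", "gene:example-1"),
]

present, absent = check_statements(extracted, database)
print("present:", present)
print("absent:", absent)
```

Exact tuple matching is the simplest possible comparison; a real check would presumably need identifier normalization and some reasoning over the ontologies involved.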

New tags in Qualifier Hash

Life_stage and Anatomy_term
Adding to enable annotation of EPIC data
Couples (or attempts to) time-and-space (life_stage-and-anatomy) annotation of expression pattern
Can ambiguities be captured?
This approach (bit of a kludge) introduces some denormalization (normalization can be automated later)

LEGO Curation

Setting up connection to Minerva
Juancarlos working with Seth, Chris, Heiko to debug setup
Would be good (necessary?) to establish a working protocol for collaboration
Raymond's LEGO-like approach to curating anatomy function
- Annotate a phenotype by annotating relevant DB objects, e.g. anatomy term, GO term, etc. as well as context/condition
- Use minimal relationships (relationship ontologies complicated and difficult to use)

September 11, 2014

SAB Meeting

Can we start putting together a more detailed agenda, at least for Caltech?
Would be good to decide on our talk topics so we can begin putting our presentation(s) together.
Curation Stats numbers spreadsheet
- Good to capture amount of time (FTEs) on curation, but also software development, curation tools, pipelines, data modeling, help desk, fixing old data
Would be good to have a rough FTE breakdown for every curator
Allocation of resources
Ontology development; how much time is spent? Is it worth it?
What tools do we have, or could we develop, that could substantially improve efficiency/effectiveness of curation? Example: sequence generation tool
What are considerations for future database migration? We should account for migration delays to curation, etc.
The curation database (like Postgres now) may or may not be the same database that drives the website
Are our curation pipelines capturing sufficient detail (or too much, unnecessary detail)?
Is it worth capturing negative data?

SAB Talk Proposals

Nomenclature - not stats, but what we do, how it's done, communication, etc. (Mary Ann)
Sequence Feature - developments (Mary Ann)
Physical Interaction Curation - a relatively new data type for us, discuss existing data, strategies for going forward, what groups we could/do collaborate with, what files we could provide
Community-Assisted Curation - what we currently do (author first pass, data submission forms), what more we could do (CANTO)
Topic-Based Curation

September 18, 2014

EPIC Data

Daniela will test once Paul tags models (tomorrow, Friday Sept 19th, 2014) and then upload to citace minus
Wen needs a couple days from model tagging to prepare the final upload

SAB

Chris (Caltech curation overview)
- Pipeline/Mission of Caltech (pull biological data from papers and put into database)
- Curation data types
- Who is who, who does what? (Photos of curators?)
- Curation stats, what is up to date, what needs more effort?
- Rate-limiting steps, tools (OA, curation status form, etc.) (slide from Daniela)
- Topic curation, pathways curation
- Brief statement on curation of other nematode species
Wen (Expression Clusters)
- Couple slides on expression curation from Daniela
- What are expression clusters? Come from microarray, proteomics, tiling array
- Triage, pattern matching
- SPELL tool: what is it? usage?
- Display of expression cluster data
- WormBase generated expression clusters (custom algorithm?)
- Enrichment of GO terms and anatomy terms (segue into WOBr talk)
Raymond (WormBase ontology browser)
- We use and develop a number of ontologies
- The ontological structure allows hierarchical browsing and reasoning
- Co-opting AmiGO 2 (existing tool)
- OWL formats
- Future developments
Mary Ann (Nomenclature & Sequence features)
- Sequence features
  - Sequence feature data display
  - JBrowse/GBrowse integration
- Nomenclature
  - CRISPR alleles
Ranjana (Human Disease)
- Update
- Two classes of disease models: genes (via variations) & transgenes (overexpression, deleterious repeats, etc.)
- ~260 gene models for human disease
- Drugs & therapeutic compounds, metal toxicity (toxicology in general)
- Disease portal? Toxicology portal?
- Toxins in Textpresso?
Kimberly (GO curation)
- Enrichment analysis
- Priorities for GO
- Annotation extensions & LEGO curation

September 25, 2014

SAB Preparations

Review of overview presentation
What data are we not curating from the newest elegans/nematode papers?
- Papers per datatype per year is roughly flat, but the total number of papers is going up ~10% each year
How many minutes should each CIT curator get to present?
Karen will present Process/Topic slides (take over from Chris)
Chris - Curation Overview, Ranjana - Disease, Raymond - WOBr, Wen - Expression datasets, Kimberly - GO, Mary Ann - Nomenclature, Sequence
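
The note above about flat per-datatype paper counts against ~10% annual growth in total papers can be made concrete with a little arithmetic. This is a minimal sketch assuming the ~10% growth compounds yearly and the per-datatype count stays exactly flat; both are simplifications, and `relative_share` is a hypothetical helper, not an existing tool.

```python
def relative_share(years: int, annual_growth: float = 0.10) -> float:
    """Share of the literature covered by a flat yearly paper count,
    relative to year 0, when total papers grow by `annual_growth` per year."""
    return 1.0 / (1.0 + annual_growth) ** years


for n in (0, 1, 5, 10):
    print(f"after {n:2d} years: {relative_share(n):.2f} of the starting share")
```

Under these assumptions, a datatype that keeps a flat paper count covers under 40% of its starting share of the literature after ten years.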

New Backups

Raymond set up backups through Duplicity
Tazendra, Mangolassi, Athena, Canopus, other Linux machines backed up
Backups are retained for two years; any data removed from a machine more than 2 years ago will be lost
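
To illustrate what the two-year limit means in practice, here is a minimal sketch of the retention check. It assumes a strict rolling two-year window and does not reflect the actual Duplicity configuration on these machines; `retention_cutoff` and `is_recoverable` are hypothetical helpers.

```python
from datetime import date

RETENTION_YEARS = 2  # assumption: strict rolling window, per the note above


def retention_cutoff(today: date, years: int = RETENTION_YEARS) -> date:
    """Earliest removal date that is still covered by the backups."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year
        return today.replace(year=today.year - years, day=28)


def is_recoverable(removed_on: date, today: date) -> bool:
    """True if data removed on `removed_on` should still exist in a backup."""
    return removed_on >= retention_cutoff(today)


today = date(2014, 9, 25)
print(is_recoverable(date(2013, 6, 1), today))  # removed about 16 months ago
print(is_recoverable(date(2012, 6, 1), today))  # removed more than 2 years ago
```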

Sequence Feature OA

Currently read only, makes data available to curators
Populated OA on sandbox (Mangolassi); skeleton form on Tazendra
