August 1, 2013

Quarterly Progress Reports

Capturing curation stats from the Curation Status form
Which data types do we want to capture curation stats for that we are not currently capturing?
We have frequent database dumps that can be read for stats
We can capture the stats table statically on a regular basis (daily)
- form at http://tazendra.caltech.edu/~postgres/cgi-bin/curation_status.cgi
- cronjob to get data from the "Curation Statistics Page" button: /home/acedb/cron/curation_stats/get_daily_curation_stats.pl
- deposits files every day at 5am to: /home/acedb/cron/curation_stats/files/
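
To illustrate the idea of capturing the stats table statically once a day, here is a minimal Python sketch. It is not the actual get_daily_curation_stats.pl script, whose contents are not described in these notes. The function names and the dated file-name pattern are assumptions made for illustration, and fetching the page from the form is left out.

```python
from datetime import date
from pathlib import Path


def snapshot_path(out_dir: Path, day: date) -> Path:
    """Return a dated file name for one day's copy of the stats table.

    The naming pattern is hypothetical, not the one used by the real cronjob.
    """
    return out_dir / f"curation_stats_{day.isoformat()}.html"


def save_snapshot(html: str, out_dir: Path, day: date) -> Path:
    """Write one day's stats page to its own file so earlier days are kept."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = snapshot_path(out_dir, day)
    path.write_text(html, encoding="utf-8")
    return path
```

One file per day means the history can be compared later without touching the live database. A standard crontab schedule of `0 5 * * *` would run such a script daily at 5am.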

RNA-Seq and Tiling Array data

Data in SPELL
Wen found many more non-modENCODE data sets
May use SVM for expression cluster data
Gene IDs can be found from original paper or data set
Up-to-date mapping to genes is not currently done

AMIGO2 (WormBase Ontology Browser)

Raymond and Juancarlos have taken AMIGO2 infrastructure to make an ontology browser for integration into WormBase
GO Term focus page demo
Graph view shows path to root (DAG view)
Inferred tree view shows:
- Ancestor terms, no annotation numbers
- Main term and children, with annotation numbers (inferred, term and descendant annotations)
- Annotation numbers link to list of genes
- Will not show "direct" annotations, only inferred
Sibling term displays: list parents with option to expand to see siblings of the main term
Separate expandable/collapsible tree of ontology ("Browse entire ontology")
Widget can be coded to integrate the ontology browser
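
The difference between the path-to-root view and the inferred annotation numbers can be sketched in Python with a toy ontology. This is an illustration only, not AMIGO2 code: the term IDs, gene names, and function names are made up, and counting distinct genes (rather than individual annotations) is an assumption.

```python
from typing import Dict, Set

# Toy ontology: child term -> set of parent terms (hypothetical IDs).
PARENTS: Dict[str, Set[str]] = {
    "T:root": set(),
    "T:a": {"T:root"},
    "T:b": {"T:root"},
    "T:c": {"T:a", "T:b"},  # in a DAG a term may have more than one parent
}

# Direct annotations: term -> genes annotated to exactly that term (made up).
DIRECT: Dict[str, Set[str]] = {
    "T:a": {"gene-1"},
    "T:c": {"gene-1", "gene-2"},
}


def ancestors(term: str) -> Set[str]:
    """All terms on any path from `term` up to the root, excluding `term`."""
    found: Set[str] = set()
    stack = list(PARENTS[term])
    while stack:
        t = stack.pop()
        if t not in found:
            found.add(t)
            stack.extend(PARENTS[t])
    return found


def inferred_genes(term: str) -> Set[str]:
    """Genes annotated to `term` itself or to any of its descendants."""
    genes: Set[str] = set()
    for annotated_term, direct in DIRECT.items():
        if annotated_term == term or term in ancestors(annotated_term):
            genes |= direct
    return genes
```

Here `ancestors("T:c")` gives the terms a graph view would draw above the main term, while `len(inferred_genes("T:a"))` is the kind of number the inferred tree view would show next to a term. A gene annotated to both a term and its descendant is counted once.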

Paper Categorization

Word frequency
- We chose papers from the Author First Pass (AFP) list with 'stress'
- About 40 papers in list, varied topics ('stress' is a broad term)
- Curation essentially now complete for most data types
- Expanding beyond AFP?
Chris will draw up preliminary tree of topics and send around
- We can discuss, edit, and expand as a group
- We want to 1) Collect positive and negative training papers and 2) Manually generate a list of key words to use for training
Todd proposes for paper pages on WormBase:
- Show a table of flagged data types for a paper?
- Give users a sense of where paper is in the curation pipeline

August 8, 2013

New Spica now has a closed (private) 'citace' account

citpub account is accessible to everyone with password
People can create their own spica accounts
Personal accounts are encouraged so as to avoid saving changes to citpub database

Worm Ontology Browser

Raymond has set up a server
Browsing should be faster now
Should be transferable to the Amazon cloud
Raymond will establish a WormBase development environment

Curation priorities

Paper categorization
Depth vs breadth of topics (number of papers?)
- 'Stress' has been a pilot topic, but is a very broad topic
- Will work on generating subcategories of 'Stress' on the Paper Categorization Wiki page
- Curators can analyze the Author First Pass list of 'Stress' papers as well as entire backlog/corpus
Goals of 'covering' a topic?
- 'Complete' and vetted process page, Wikipathway
- Promote 'featured processes' on WormBase for a given release
We should collect positive and negative papers (for a given topic) for SVM training
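
As a rough sketch of how positive and negative papers plus a manually chosen keyword list could be turned into SVM training input, the Python below builds keyword-count feature vectors and labels. It stops before the SVM itself. The function names, the +1/-1 label convention, and the restriction to single-word keywords are all assumptions for illustration.

```python
import re
from collections import Counter
from typing import List, Sequence, Tuple


def keyword_features(text: str, keywords: Sequence[str]) -> List[int]:
    """Count how often each (single-word) keyword occurs in the text."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return [counts[k.lower()] for k in keywords]


def build_training_set(
    positive: Sequence[str],
    negative: Sequence[str],
    keywords: Sequence[str],
) -> Tuple[List[List[int]], List[int]]:
    """Return feature vectors and labels (+1 positive paper, -1 negative)."""
    features = [keyword_features(t, keywords) for t in positive]
    features += [keyword_features(t, keywords) for t in negative]
    labels = [1] * len(positive) + [-1] * len(negative)
    return features, labels
```

For example, `keyword_features("Oxidative stress and heat stress.", ["stress", "heat", "hypoxia"])` returns `[2, 1, 0]`. The resulting vectors and labels are in the shape most SVM libraries expect for training.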

Curators should check CGIs

Submission forms and other CGIs may have been altered (only in the publicly accessible "azurebrd" account; the account name is visible in the URL)

August 15, 2013

Geneace

Raymond

We have geneace and can get the data for a nightly geneace dump from Michael P

Amigo

Raymond, Kimberly, Paul S

Raymond is working on a development environment. Will host the server at Caltech for now. Need to have a single backend. Depends on >32GB of RAM. Will try to make it less memory dependent.

Kimberly tells us about the new GO Model, will send us link to wiki.

New relationships in LEGO are not fully defined yet. We need an ontology editor that is being developed in the direction we want. The OBO-to-OWL conversion is defined.

Expression Pattern

Xiaodong, Raymond

Xiaodong, Daniela, Raymond, Wen meeting to discuss expression pattern curation.

Bottom up: look at the data and see how to fit it into the existing database structure. We also want a top-down approach based on the types of data and relationships, which will help us find holes in the data modeling.

RNAi movies

Daniela, Paul S, Raymond, Gary S

Contacted the people responsible, but the person was on vacation until yesterday.

Do they have functional genotype data we don't have? Everything should be in, the paper has been curated, it's about having the movie files.

Are their pages useful? Not sure, just checked if we can take their movie files. Should we link to their pages? Igor and maybe Raymond talked about setting up the links. We can change the movie model and link through there. There is a database tag in the RNAi model. We don't have the links for other data, so we'd have to look.

Tiling array

Wen, Raymond

Hinxton is not going to work on all tiling array or RNA-seq data. Wen will work on the biology part of the experimental objects and push Hinxton to map to genes. Will Hinxton take on the migration to the build? Focus on what we can do.

WormBase-Caltech Weekly Calls August 2013

