WormBase-Caltech Weekly Calls March 2014

March 6, 2014

Agenda

?Construct model
SPELL

SPELL

WS242 started to have miRNA datasets; very short probes, not uniquely mapped
Wen made modifications to SPELL to accommodate
Wen has started separating datasets from papers
miRNA analyses have used completely different platforms from protein-coding gene analyses, so each requires its own dataset
Wen started loading WB topics into SPELL
Wen would like to assign ~100 papers in SPELL to WB topics
Wen also modified SPELL script to annotate datasets applicable to specific tissues or cells
Wen may/will co-opt Juancarlos' SPELL instance for testing

?Construct model

New ?Construct model will be vetted and approved for early in the WS244 curation cycle
Some changes to the ?Construct model proposed over last week
Still discussing how to handle plasmids/vectors, whether they should go in the ?Clone class (there is precedent)
We should defer to the Sequence Ontology, wherever possible, for definitions and relationships
We will go ahead with considering adding all plasmids and vectors (e.g. Addgene & Fire vectors) to the ?Clone class, which can then be referenced within the ?Construct class
Will change the "Identical_transgene"/"Identical_variation" tags to "Corresponding_transgene"/"Corresponding_variation"
If necessary, we will consider adding a ?Construct tag to the ?Interaction model to accommodate annotation of constructs used in in vitro physical interaction experiments; this is pending curation of (sufficient amounts of) the relevant data

March 13, 2014

Agenda

?Infection_assay model
?Construct model (do we need to discuss?)
- From my side there is no need to discuss further, as long as we can keep the Clone tag in the ?Construct model and plasmids continue to be curated in the ?Clone class. -Daniela
- Clone wiki from PaulD: http://wiki.wormbase.org/index.php/WormBase_Model:Clone#Class_contents_WS242
Database Future Mtg report

Database Meeting Summary

Write up <https://www.dropbox.com/s/fr9qrsbup9djx4z/DatabaseFutureMeetingatOICR.pdf>.
Working group from three sites will test candidate technologies against metrics, considerations and requirements and report by Oct 2014.
Not clear yet whether everyone (in WormBase) will use one universal database technology, or if each site might use different databases
Site requirements: web speed & performance, model flexibility with regular updates, understandability of modeling language/structure
Different options: relational, row, column, NoSQL, SQL, object-oriented, graph database (Neo4J), DynamoDB (Amazon NoSQL DB)
Central database for everyone? Real-time editing and updating, or regular updating/synchronization
Process will take ~2 years or so

Data visualization with Santiago Lombeyda

Will play with some potentially straightforward display options
Virtual worm renderings made into SVGs for layering and clicking/linking

?Infection_assay model

Broaden scope to include all types of species-species interaction (?Interspecies_interaction class ???)
Remove "Modifying_influence" tag and "Required*" tags in favor of "Resistance" and "Hypersensitivity" sets of tags
If we are going to include many types of species-species interactions, we need to consider how to make tag names that are unambiguous with respect to which species is playing a particular role
We will talk to parasite curators to see what we want to include in the model
Ranjana would like to add a concise-description-like text description to genes describing their role in infection
- Will require retroactively making database connections once the ?Infection_assay (or equivalent) model is finalized and implemented (not this upcoming release)

WormBase Topic/Process hierarchy/ontology relationships

Karen will send around a working OBO file
Curators can look at which topics should be related to others (via parent-child relationships)
Also, we can look at trying to tie in to existing GO terms
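
To make the parent-child idea concrete, here is a minimal sketch of how is_a relationships could be read out of an OBO file. It assumes standard OBO [Term] stanzas; the "WBtopic:" identifiers and topic names are hypothetical placeholders and do not come from Karen's file.

```python
from collections import defaultdict

# Hypothetical two-term OBO fragment; IDs and names are placeholders.
OBO_TEXT = """\
[Term]
id: WBtopic:0000001
name: example parent topic

[Term]
id: WBtopic:0000002
name: example child topic
is_a: WBtopic:0000001 ! example parent topic
"""


def parse_is_a(obo_text):
    """Return a mapping of parent term ID -> list of child term IDs."""
    children = defaultdict(list)
    current_id = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            # A new stanza starts; forget the previous term.
            current_id = None
        elif line.startswith("id: "):
            current_id = line[len("id: "):]
        elif line.startswith("is_a: ") and current_id is not None:
            # Drop the trailing "! name" comment, keep only the parent ID.
            parent = line[len("is_a: "):].split("!")[0].strip()
            children[parent].append(current_id)
    return dict(children)


children_of = parse_is_a(OBO_TEXT)
print(children_of)
```

A mapping like this would let curators list the children proposed for each topic before deciding which relationships to keep.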

March 20, 2014

Agenda

Pad 0s in WBPerson objects. https://github.com/WormBase/website/issues/2522#issuecomment-38077230
Database discussions
GO Meeting update

New Database Discussions

We want speed, efficiency, performance
Will keep WB 2.0 web architecture
One big database for everyone?
Timestamps would not be kept for all annotations (performance, economy issues)
We still keep "Date_last_updated" and "Curator_confirmed" in ACEDB in the #Evidence hash
We would want to specify when and where we would want to keep (and track) timestamps
ACEDB timestamps have proven useful

Padding WBPerson ID numbers with zeros

Laboratory search for "Raymond Chan" did not produce the desired/expected results
WBPerson ID is indexed but not the text string of the affiliated person
In this case WBPerson98 is listed, but automated searching for the full text person name was affected by the return of many WBPerson98* results
How much work would be required to change the WBPerson ID padding? Not trivial
We'll clarify what we think should be the user experience and Abby can decide the best way to fix it
We should thoroughly go through classes and clarify what fields should be indexed
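
A minimal sketch of why unpadded IDs cause this, assuming the search behaves like a prefix match on the ID string. The padding width of 7 and every ID other than WBPerson98 are illustrative assumptions, not a description of the actual WormBase index.

```python
import re


def pad_person_id(person_id, width=7):
    """Zero-pad the numeric part of a WBPerson ID to a fixed width (width is assumed)."""
    match = re.fullmatch(r"WBPerson(\d+)", person_id)
    if match is None:
        raise ValueError(f"not a WBPerson ID: {person_id!r}")
    return f"WBPerson{int(match.group(1)):0{width}d}"


def prefix_search(ids, query):
    """Return every ID that starts with the query string."""
    return [i for i in ids if i.startswith(query)]


# Hypothetical set of IDs sharing the "98" prefix.
ids = ["WBPerson98", "WBPerson980", "WBPerson9812", "WBPerson12"]

# Unpadded: the query also matches WBPerson980 and WBPerson9812.
unpadded_hits = prefix_search(ids, "WBPerson98")

# Padded to a fixed width: only the intended record matches.
padded_ids = [pad_person_id(i) for i in ids]
padded_hits = prefix_search(padded_ids, pad_person_id("WBPerson98"))

print(unpadded_hits)
print(padded_hits)
```

Fixed-width IDs cannot be prefixes of one another, which is what removes the extra hits; an exact-match lookup on the existing IDs would be an alternative that avoids renaming objects.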

GO Meeting update (from Paul Sternberg and Kimberly Van Auken)

Chris Mungall presenting/discussing common annotation tool
Table view annotator
Graph view annotator (LEGO/ORION) (example: http://go-genkisugi.rhcloud.com/seed/model/gomodel:goa_human-5323da180000002)
Tree view, PAINT tool used
Text, Paper viewer (SAB was excited about)
Protein-2-GO tool: dumped or restructured/repurposed?
Protein-2-GO open source? Should be open to further development
OBO-Edit is no longer being supported; format is changing from OBO to OWL
Switching from OBO-Edit to Protege (will take time, ~6 months development)
CANTO (PomBase)
- http://curation.pombase.org/
- Online annotation tool, making use of ontologies
AmiGO 2 now live on GO website
http://beta.geneontology.org/
Enrichment analysis (from PANTHER database)

TermGenie

New tool
Relies on logical definitions for GO terms
Web-based form for adding GO annotations when you can create logical definitions (explicit use of defined relationships)
e.g. response to chemical (when chemical has a ChEBI ID), etc.
Great for requesting new terms
http://go.termgenie.org/
http://code.google.com/p/termgenie/

Disease curation

FlyBase went live with disease data
We want to continue to have (and maybe extend) our connections/links to and from OMIM
We could work on a WormBase/OMIM portal

March 27, 2014

Agenda

Removing NOT phenotype from Phenotype GAF file.
- Michael's comment: Removing it is easy, but I don't know what the position of the GO curators on that is. Also I think the negative phenotypes are quite useful to have.
- Kevin's comment: Indeed. Conversely, if negative phenotypes are uninformative and confusing, then shouldn't we remove them across the board? i.e. from the GAF file and the database and web displays?
WOBr ready for testing
- Enter WOBr issues here

Disease data in WOBr

Ranjana will inform EBI how to generate Gene Association File (GAF)
Once GAF is generated, will send to Raymond for inclusion into WOBr

Life stage Gene Association File (GAF)

Will need to also inform EBI on how to generate GAF
Daniela can e-mail Raymond to include GAF for WOBr

Removing NOTs from Gene Association Files (GAFs)

Phenotype GAF has NOTs for unobserved phenotypes
Should we remove NOTs from GAF?
Conclusion: We will leave the GAFs as they are and perform a pre-filtering step during WOBr data incorporation to exclude NOT phenotype annotations (for now)
Something to consider for future: Separate NOTs into their own GAF?
Will need to understand all of the downstream uses of the GAFs
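
A minimal sketch of the pre-filtering step described above, assuming the Phenotype GAF follows the standard tab-delimited GAF layout in which column 4 holds the qualifier (NOT, possibly pipe-separated with other qualifiers) and comment lines start with "!". The function name and the sample rows are hypothetical and are not taken from the actual WOBr pipeline.

```python
def drop_not_annotations(gaf_lines):
    """Return GAF lines with NOT-qualified annotations removed.

    Comment lines (starting with '!') and blank lines pass through unchanged.
    """
    kept = []
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            kept.append(line)
            continue
        columns = line.rstrip("\n").split("\t")
        # Column 4 (index 3) is assumed to be the qualifier field.
        qualifiers = columns[3].split("|") if len(columns) > 3 else []
        if "NOT" in qualifiers:
            continue
        kept.append(line)
    return kept


# Hypothetical rows, truncated to the first five columns for brevity.
sample = [
    "!gaf-version: 2.0",
    "\t".join(["WB", "WBGene00000001", "gene-a", "", "WBPhenotype:0000001"]),
    "\t".join(["WB", "WBGene00000002", "gene-b", "NOT", "WBPhenotype:0000002"]),
]

filtered = drop_not_annotations(sample)
for row in filtered:
    print(row)
```

Filtering at load time leaves the released GAFs untouched, so other downstream users of the files are not affected.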

Parasites

Not clear what input we would want to get from parasite curators and Matt Berriman's group
Depends on the ?Infection_assay model and whether we are planning on expanding to nematode species-to-species interactions
