WormBase-Caltech Weekly Calls April 2014

From WormBaseWiki
Revision as of 15:54, 30 April 2014 by Cgrove

April 3, 2014

WOBr

  • Testing at juancarlos.wormbase.org (from there go to Tools > Ontology Browser)
  • As one-offs, Raymond can take newly generated GAFs from curators and process for testing.

Murray Expression datasets

  • Still need to acquire primary expression data from Waterston lab
  • Problems with handling inferred expression of genes in cells
  • Thus inferred expression data will be removed
  • Expression object will retain graphical representations supplied by Murray et al.
  • A friendly reminder will be sent to Waterston requesting tabulated, processed primary expression data.


April 10, 2014

Apologies

  • Karen: attending impromptu OICR group meeting, won't be on the Caltech call.

BioCurator Meeting

ISB2014_group_notes

  • many text mining tools
  • many ontology systems, mapping ontologies
    • Aging ontologies
    • Anatomy ontology mapping (Uberon)
    • Format convergence? OBO, OWL, RDF? Plugin for Protege to read OBO files
    • How extensive is OWL? How much has OWL been applied to biology? OWL is being used for reasoning
  • Community Annotation
    • CANTO (Community ANotation TOol) workshop (Wen attended)
      • PomBase uses CANTO, and it works well
  • Big data curation open discussion/workshop
    • Enormous datasets (e.g. genome sequencing of all cancer patients)
    • Data storage issue
    • Data stability issue
    • Clinical data, portal set up by Michael Cherry, ClinGen
  • Lincoln's talk
    • International Cancer Genome Consortium (ICGC) project (petabyte datasets)
    • 10,000 cancer patients already sequenced (18 PB by 2018)
    • Torrent (P2P) data sharing
  • BioCreative workshop
  • Amazon "Mechanical Turk" community annotation
  • EcoCyc, MetaCyc - pathway curation (will hear from Karen)
  • Reactome
    • May be a good tool for displaying TF-target gene interactions (vs. Cytoscape)
    • Add a Reactome display to WormBase?
  • SPELL - anything to replace SPELL? No.
    • Still lacking other tools for microarray and RNA-Seq data

Heartbleed SSL security issue

  • Raymond updated all servers/machines
  • All machines need to be rebooted
  • Any user passwords should be changed
  • Do we care about any non-root users? Best to change all passwords on machines that have been updated

BioGrid C. elegans physical interactions

  • Kimberly has been using BioGrid's curation tool
  • We have a table of C. elegans large-scale genetic and physical interaction data
  • We're still trying to get all data

Timestamps

  • Curator evidence vs. history of curation
  • Kevin and Juancarlos still looking for use-cases
  • Need to consider for future database migration and potential use of a common central database
  • Curators use timestamps for culling out bogus data objects/cleaning up data

April 17, 2014

Agenda

  • BioGrid
  • AmiGO
  • Reactome
  • BioCurator Meeting
  • Fission Yeast phenotype ontology & PATO

BioGrid followup

  • What are the roadblocks to protein-protein interaction curation and incorporation into WormBase?
  • Chris, Kimberly, and Rose have communicated about C. elegans interaction data
  • We will import BioGrid interactions, non-redundantly
  • We are still sorting out some discrepancies regarding the Li et al. 2004 Y2H interactome dataset
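
A minimal sketch of what a non-redundant import could look like, assuming each interaction is reduced to an unordered gene pair plus an interaction type and a paper ID. The record layout and the function names are hypothetical and are not the BioGrid or WormBase format.

```python
def interaction_key(gene_a, gene_b, interaction_type, paper):
    """Build an order-independent key so A-B and B-A count as the same interaction."""
    return (frozenset((gene_a, gene_b)), interaction_type, paper)


def non_redundant_import(existing, incoming):
    """Return the incoming records whose key is not already present.

    Records are (gene_a, gene_b, interaction_type, paper) tuples (hypothetical layout).
    """
    seen = {interaction_key(*rec) for rec in existing}
    new_records = []
    for rec in incoming:
        key = interaction_key(*rec)
        if key not in seen:
            seen.add(key)  # also removes duplicates within the incoming batch
            new_records.append(rec)
    return new_records
```

Keying on a frozenset of the two genes is a design choice that treats interactions as undirected; a directed data type (e.g. regulatory interactions) would need an ordered key instead.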

AmiGO verification manager

  • The GO verification manager creates a co-annotation matrix that highlights GO annotation intersections.
  • The matrix makes it easy to see which pairs of terms are never co-annotated and which are co-annotated often. From this analysis, PomBase built rules for future annotation: if a new annotation pairs two terms that have never been co-annotated before, an alert is sent.
  • The curator can then reassess the annotation, either fixing it, or adjusting the rules.
  • Val Wood (PomBase) presented a poster on GO annotation rules built through a co-annotation analysis.
  • I'm (Karen) thinking this would be a good tool for analyzing and setting rules for phenotype annotation.
    • http://amigo2.berkeleybop.org/cgi-bin/amigo2/matrix
    • https://sourceforge.net/apps/trac/pombase/wiki/MatrixProject
    • http://build.berkeleybop.org/job/check-shared-annotations/lastBuild/console
  • They can include worm checks in this project as well. I (Karen) don't know what we would need to do for that.
  • Karen can send along Val's poster to anyone who wants to know more
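
The co-annotation idea described above can be sketched roughly as follows. This is an illustration only, not the AmiGO or PomBase implementation: the data layout (a dict of gene to term set) and all function names are assumptions, and real rules would be reviewed by curators rather than derived purely from zero counts.

```python
from collections import Counter
from itertools import combinations


def co_annotation_counts(annotations):
    """Count how often each pair of terms is annotated to the same gene.

    annotations: dict mapping gene ID -> set of term IDs (hypothetical layout).
    """
    counts = Counter()
    for terms in annotations.values():
        for pair in combinations(sorted(terms), 2):
            counts[pair] += 1
    return counts


def never_co_annotated(counts, term_a, term_b):
    """True if the two terms have never been annotated to the same gene."""
    return counts[tuple(sorted((term_a, term_b)))] == 0


def check_new_annotation(annotations, counts, gene, new_term):
    """Return the gene's existing terms that have never co-occurred with new_term.

    A non-empty result is what would trigger an alert for the curator.
    """
    return sorted(
        term
        for term in annotations.get(gene, set())
        if term != new_term and never_co_annotated(counts, term, new_term)
    )
```

The same counting would work for phenotype terms instead of GO terms, since nothing in the sketch depends on the ontology being used.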

Reactome training sessions

  • Peter D'Eustachio is setting up WebEx training sessions on curating in Reactome for the different MODs; this includes WB, SGD, FlyBase, and PomBase, at least.
  • Right now the thought is that there will be a couple intro WebEx sessions, and eventually an in-person curate-athon (jamboree) where all curators will get together and do a shared project (place to be determined).
  • Karen will definitely be participating, but this can be open to anyone.
  • Reactome data can be imported into WikiPathways, so the data can make the "round trip" back to WormBase

Fission Yeast Phenotype Ontology (FYPO)/PATO

  • Should we move to PATO?
  • The FYPO has recently been published
  • The ontology can be downloaded from the OBO Foundry
  • (to open it in OBO-Edit, you need to set an advanced setting to allow danglers)
  • At first it seems pretty chaotic, but after a bit I found it makes a lot of sense: the branches are all different views and access points to the phenotypes.
  • The phenotypes themselves are composed of different ontologies, GO, ChEBI, CL (Cell) and PATO qualities, so terms from these ontologies are used as 'parent' terms. There is also, of course, the phenotype branch itself along with a quality branch. I think this organization actually makes the ontology more accessible to people as it lends itself to being approached from different vantage points.
  • I (Karen) think it would be good for us to invest in moving our phenotype ontology to be PATO compliant, and I think FYPO offers an excellent example for organizing it.
  • Migrating the WB phenotype ontology to PATO is not trivial

?Construct model

  • There's an issue with incorporating, for example, MosSCI alleles and displaying them on the website

Monarch Initiative

Testing complex data in a central curation database

  • Juancarlos would like examples of potentially complex data curation to determine important data modeling considerations

April 24, 2014

Topics Curation

  • Pathway curation
    • WikiPathways, Reactome, LEGO
    • Reactome training/jamboree coming up
  • New topic: Chromosome segregation (mitotic/meiotic)
    • Annotation extensions (from GO) could be used

Gene Ontology

  • Paul Thomas visits next Tuesday, 10am PDT
    • Video conference call? Skype, Google Hangout? Join Me (www.join.me)?
    • LEGO demo
  • Suzie Lewis proposed a once-a-quarter GO-consortium conference call
  • We could talk about Textpresso curation tool
  • Textpresso YouTube (or other?) video
  • GO-central
    • Need a table/grid view, ontology annotator view
    • OWL format for LEGO database ('OWLification')
  • Paul T., Suzie L., and Paul S. will discuss syncing with UniProt in July
    • Protein2GO tool
    • Entity annotation tool
    • Central annotation and quality control
  • Protege-compatible tool for GO consortium
  • OBO-Edit will be supported until replaced
  • Currently there is no GO-enrichment tool in WormBase; can we link out to relevant/good tools?
  • Enrichment tool could be applied to other ontologies (e.g. phenotype) as well
  • PAINT tool (Phylogenetic annotation) [1]
    • Annotations of common ancestors based on shared GO annotations of homologs/paralogs etc.
    • Software has improved in last year
    • Best to do a quality control test
    • "Summer of PAINT"?
    • Tool has a user-inertia issue: easier to use when using often, harder when using rarely
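
For the enrichment-tool item in the list above, the core calculation of a typical term-enrichment test is a one-sided hypergeometric test. The sketch below is a generic illustration, not any specific existing tool; the function and parameter names are made up for this example, and a real tool would also correct for multiple testing.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, sample_annotated):
    """P(X >= sample_annotated) for a hypergeometric draw.

    population:       number of genes in the background set
    annotated:        background genes carrying the term
    sample:           number of genes in the query list
    sample_annotated: query genes carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(sample_annotated, upper + 1)
    )
    return tail / total
```

Because the test only needs counts of genes per term, the same function applies to phenotype or anatomy terms as well as GO terms.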

WOBr status

  • Various Gene-Association Files (GAFs) pending (life stage, disease, etc.)
  • Raymond and Juancarlos making progress, should be released soon
  • Some issues; e.g. a single GO term with >7000 children (!) causing problems for AmiGO and WOBr
    • We could enforce a cut-off for terms with too many annotations (for performance)
  • Currently still running on local machine (juancarlos.wormbase.org)
  • Will push to live site ASAP
  • Terms that have children will explicitly say so (to be clear to users)
  • Terms known to have no children will not have a link to expand and view the (non-existent) children
  • Terms for which we don't know whether children exist will show a "?" symbol instead of a "+"; expanding the term reveals whether it has children.
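
The display rules above can be summarized in a short sketch. This is not the WOBr code: the function names, the use of None for "unknown", and the cut-off value of 500 are all assumptions made for illustration.

```python
def expander_symbol(child_count):
    """Pick the marker shown next to a term in the browser tree.

    child_count: number of children, or None when it is not known.
    """
    if child_count is None:
        return "?"  # unknown: expanding will reveal whether children exist
    if child_count > 0:
        return "+"  # known to have children
    return ""       # known to have no children: no expand link


def children_to_display(children, max_children=500):
    """Apply a display cut-off; returns (shown_children, was_truncated)."""
    if len(children) > max_children:
        return list(children[:max_children]), True
    return list(children), False
```

Returning a truncation flag lets the page tell the user that the list was cut for performance rather than silently hiding terms.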

SPELL

  • Michael Cherry/SGD considering rewriting SPELL

HTML format for special characters in ACEDB

  • ACEDB doesn't support special character display (in concise descriptions, for example)
  • GitHub issue [2]
  • Not clear that using HTML format ever worked
  • Daniela mentioned script "Find-characters" that would process dumped .ace files for special characters
  • Ideally characters are converted during the dump
  • Can we use Unicode for all special characters?
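
A rough sketch of the two steps discussed above: locating special characters in dumped text, and converting HTML character references to Unicode. This is not Daniela's "Find-characters" script, whose details are not recorded here; the function names are hypothetical. Note that Python's `html.unescape` also converts `&amp;`, `&lt;`, and similar references, which may or may not be wanted in .ace files.

```python
import html


def find_non_ascii(lines):
    """Yield (line_number, character) for every non-ASCII character."""
    for number, line in enumerate(lines, start=1):
        for char in line:
            if ord(char) > 127:
                yield number, char


def entities_to_unicode(text):
    """Convert HTML character references such as &alpha; or &#945; to Unicode."""
    return html.unescape(text)
```

Running the finder over a dump first gives a list of affected lines to review before any conversion is applied during the dump itself.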

Strains, transgenes consistency with CGC

  • There have been discrepancies between WB and CGC in the transgenes listed for strains
  • Karen working on cleaning up
  • Cause? E.g. authors mis-incorporating transgene info
  • Juancarlos wrote scripts that check the data across the two databases and find discrepancies
  • Discrepancies can be sent to Aric at CGC
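
The kind of cross-database check described above might look like the following. This is an illustrative sketch, not Juancarlos's scripts; it assumes each source has already been loaded into a dict mapping strain name to a set of transgene names, and the function name is made up.

```python
def compare_strain_transgenes(wb, cgc):
    """Compare strain -> set-of-transgenes mappings from two sources.

    Returns a dict of strain -> (only_in_wb, only_in_cgc) for strains that
    are present in both sources but list different transgenes.
    """
    discrepancies = {}
    for strain in sorted(wb.keys() & cgc.keys()):
        only_wb = wb[strain] - cgc[strain]
        only_cgc = cgc[strain] - wb[strain]
        if only_wb or only_cgc:
            discrepancies[strain] = (only_wb, only_cgc)
    return discrepancies
```

Strains found in only one source are ignored here; reporting them would be a separate, simpler set difference on the strain names.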

New England Area Parasitology/Helminthology meeting

  • Xiaodong attended, representing WormBase
  • Not many parasitologists using WB

Phenotype cross-product terms

  • e.g. "bloated", "kinker", "roller"
  • Fuzzy synonyms in PATO?
  • Gary S. using GO and PATO
  • Do we need to add quality terms to PATO?
  • Approach: first fix slim terms, map slim terms