WormBase-Caltech Weekly Calls April 2015

From WormBaseWiki
Revision as of 15:33, 7 May 2015 by Cgrove

April 2015

April 2, 2015

Adding Species information to nightly geneace dumps

  • Michael Paulini is adding Species information for the WBGene IDs

WormBase Ontology Browser WS248 online on Dev

  • Caveat: the rest of the WormBase data on the Dev site is WS247 (one release behind)
    • Would explain some discrepancies

Micropublication form

  • Daniela waiting for feedback from Paul's lab
  • Will send to Alison Frand and Elissa Hallem at UCLA to get input

Mary Ann's IWM abstract for community annotation

  • People can send comments over the next day or two
  • Will make revisions and send around next week

April 9, 2015


  • Had some glitches with the secondary hard drive this week; not sure what the problem is
  • The /home2 directory is not intended as primary storage; curators need to back their data up
  • Paper PDFs are stored on /home2 as their primary storage location
  • Papers are backed up on Athena (Wen's computer)
  • TIF files are stored on CDs
  • Main Tazendra hard drive is RAIDed/mirrored and backed up: nightly backups are kept for ~1 month (or 0.5 month), monthly backups for a year; no older versions are kept
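
The retention scheme described above can be sketched as a small check. This is only an illustration, not the actual Tazendra backup configuration: the 30-day window stands in for "~1 month", and treating the backup taken on the 1st of each month as the monthly copy is an assumption made for the example.

```python
from datetime import date

NIGHTLY_DAYS = 30   # assumption: "~1 month" taken as 30 days
MONTHLY_DAYS = 365  # "monthly for a year"


def is_retained(backup_day: date, today: date) -> bool:
    """Return True if a backup taken on backup_day would still exist today."""
    age = (today - backup_day).days
    if age < 0:
        return False  # a backup dated in the future is not meaningful
    if age <= NIGHTLY_DAYS:
        return True   # every nightly backup is kept inside this window
    if age <= MONTHLY_DAYS:
        # assumption: the monthly copy is the backup taken on the 1st
        return backup_day.day == 1
    return False      # nothing older than a year is kept
```

Under these assumptions, `is_retained(date(2015, 2, 1), date(2015, 4, 9))` is true, while a backup from 5 March 2015 would already be gone on the same day.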

Citace data upload

  • April 28th, 10am

Community Annotation Forms

  • Mary Ann's IWM (poster) abstract includes community annotation forms
  • Community annotation form list:
    • Micropublication
    • Gene Expression form (made 15 years ago); mostly not used; micropublication form could be co-opted
    • Gene description form (primarily for the community to edit the automated concise descriptions)
    • "Submit Data" list on WormBase (http://www.wormbase.org/about/userguide/submit_data#0--10)
  • Would be good to have a community annotation portal/landing page
  • Mary Ann will send around a summary of discussions about community annotation, along with prioritization of forms for curators and Juancarlos
  • Relevant curators should consider reviewing existing forms for updates

modSeek


  • http://seek.princeton.edu/modSeek/
  • Developed by a 6th-year Princeton grad student (leaving within 6 months)
  • "Seek" was developed for human microarray and RNA-seq data
  • 5 model organisms in "modSeek": yeast, worm, fly, mouse, zebrafish
  • Pulls in data from Gene Expression Omnibus (GEO)
  • Analysis computed from raw data
  • Data from SPELL should be reasonably easy to transfer to modSeek
  • We need to determine if WormBase data can be updated per release
  • Has text mining
  • Cross-species comparisons available

April 16, 2015

Topic Images

  • Daniela importing Lipid Metabolism topic images
  • Daniela may also try to create gene-image connections
  • Picture OA could capture relevant genes
  • For now, Daniela will only import images for which we have permissions

WormBook Chapters

  • Would be good to make sure relevant FTP files are mentioned and pointed to
  • We can add a section on concise description and automated description
    • Not sure yet where to put it; maybe in introduction chapter

April 23, 2015

WormBook Chapters

  • Importing figures or screenshots can result in lost resolution
  • We may need to play with image manipulation/editing to fix
  • Gene function chapter has an InterMine section; maybe it should be an appendix or separate chapter
    • Will make a separate chapter; will reference InterMine with a single screenshot in the gene function chapter
  • We will make sure the gene function chapter properly links to GO chapter


  • Karen created Lipid metabolism pathway
  • Dennis Kim and Jonathan Ewbank are updating their WormBook chapter on innate immunity; working with Karen on WikiPathway


  • Kimberly participating in conference calls
  • Progress being made on back-end to handle evidence properly
  • Curators have been going through specific use cases
  • Discussing, in general, how best to model different biological scenarios
  • Estimate for getting back end finished on the order of weeks
  • Front-end still being developed; getting close
  • Karen asks: How will LEGO integrate with Process&Pathway curation at WormBase?
    • Kimberly: LEGO is very molecular-function-centric; may be more granular than WB Topics/Process&Pathway
    • We should see soon how these two will interface

Controlled vocabulary for institutions

  • How to handle synonyms of institution names
  • We will accept entries of synonyms, but will only actually store/track the (English) official name
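
A minimal sketch of the synonym handling described above: any known synonym is accepted as input, but only the official English name is returned for storage. The table entries and the `normalize_institution` helper are hypothetical examples, not an existing WormBase vocabulary or API.

```python
from typing import Optional

# Hypothetical example entries; keys are lowercased synonyms,
# values are the official (English) names that would be stored.
OFFICIAL_NAMES = {
    "california institute of technology": "California Institute of Technology",
    "caltech": "California Institute of Technology",
    "university of california, los angeles": "University of California, Los Angeles",
    "ucla": "University of California, Los Angeles",
}


def normalize_institution(entry: str) -> Optional[str]:
    """Map a user-entered name or synonym to the official name, if known."""
    # Collapse runs of whitespace and ignore case before the lookup.
    key = " ".join(entry.split()).lower()
    return OFFICIAL_NAMES.get(key)  # None means the entry is not recognized
```

With this table, `normalize_institution("  Caltech ")` returns the official name, and an unrecognized entry returns `None` so it can be reviewed rather than stored as-is.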

April 30, 2015

modSeek


  • http://seek.princeton.edu/modSeek/
  • Wen has established an account
  • Wen is running scripts to generate worm data for modSeek
  • Data in modSeek is GEO (Gene Expression Omnibus) based
  • Links at modSeek go to GEO data, not WormBase
  • Wen is considering a WB-specific search with links back to WB data
  • modSeek only has (and supports) C. elegans data, not other nematodes for which we have data
  • Maybe we could have a "WormBaseSeek"/"WB-Seek"?
  • We can run a mirror at WB
  • Wen is modifying SPELL scripts for modSeek
  • Would we be able to perform cross-species comparisons with a WB-Seek?
  • Textpresso-Dev is currently running our modSeek instance
    • Textpresso-Dev probably would not be able to handle mouse, human, fly data for species comparisons etc.
  • We would like to be able to perform cross-species comparisons for all nematode species as well as the other MODs/human
  • WB data is paper-centric, unlike data in modSeek
  • WB data is processed data (author processed), modSeek data is raw data
    • We want to be able to keep/maintain the processed data in addition to the raw data

BioCurator Meeting 2015 (China)

  • ~300 people in attendance, about 1/3 from outside China
  • Mouse (MGI), yeast (SGD), EBI, NCBI, and WB represented at meeting
  • Xiaodong's training session
    • Q&A about literature curation
    • Many small bio-databases in China, focused mainly on genome annotation analysis; no literature curation
    • Xiaodong demonstrated the Ontology Annotator (OA) tool
  • Currently there is no bio-data center in China; Biocuration society would like to establish one
  • Web Apollo, JBrowse, phylogenetic GO annotation info presented
  • Yeast database (SGD?) uses IntAct (in collaboration with BioGrid) to curate protein complexes
  • There was considerable discussion about the use of Uberon as a cross-species anatomy ontology
    • Would be good for WB to establish connections to Uberon with C. elegans anatomy ontology
    • This will require specification of relationships; non-trivial
  • Yuling presented work on SVM analysis
    • People had questions about whether SVM could be used for hierarchical flagging (data types and then subtypes)
      • This requires training sets; could consider moving forward
      • (Karen) We already run SVM for allele data followed by Textpresso entity recognition
      • We would likely run the entire corpus through the SVM for a whole new datatype with proper training sets
    • Can SVM be applied at the paragraph or sentence level?
      • Possible, and we have some curated sentences saved, for example with cell component curation (CCC)
  • UniProt developing UniRule system for large scale protein annotation
    • Manually establish (text mining?) rules for future recognition and annotation
  • Reactome is beginning to use ORCID person IDs to give attribution for Reactome pathways
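
The hierarchical flagging idea raised in the SVM discussion above (data types first, then subtypes) could be structured roughly as follows. This is a sketch only: the classifiers here are hypothetical keyword stand-ins rather than trained SVMs, and the type and subtype labels are illustrative, not the ones used in the WormBase pipeline.

```python
from typing import Callable, Dict, List, Optional, Tuple

Classifier = Callable[[str], bool]  # stand-in for a trained model's predict step


def flag_paper(
    text: str,
    type_classifiers: Dict[str, Classifier],
    subtype_classifiers: Dict[str, Dict[str, Classifier]],
) -> List[Tuple[str, Optional[str]]]:
    """Flag data types first, then subtypes only within each flagged type."""
    flags: List[Tuple[str, Optional[str]]] = []
    for data_type, is_type in type_classifiers.items():
        if not is_type(text):
            continue  # stage 1 negative: skip this type's subtype models
        subtypes = [
            name
            for name, is_subtype in subtype_classifiers.get(data_type, {}).items()
            if is_subtype(text)
        ]
        if subtypes:
            flags.extend((data_type, name) for name in subtypes)
        else:
            flags.append((data_type, None))  # type flagged, no subtype matched
    return flags


# Hypothetical keyword stand-ins; a real pipeline would plug in trained SVMs.
TYPE_CLASSIFIERS: Dict[str, Classifier] = {
    "allele": lambda t: "allele" in t.lower(),
    "expression": lambda t: "expression" in t.lower(),
}
SUBTYPE_CLASSIFIERS: Dict[str, Dict[str, Classifier]] = {
    "allele": {"deletion": lambda t: "deletion" in t.lower()},
}
```

The point of the two-stage layout is that subtype models only run on text the first stage has already flagged, so each subtype model needs its own training set but sees far fewer inputs.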

FTE estimates for WB staff

  • Paul S. sent around an e-mail with a spreadsheet for filling out FTE estimates
  • Paul asks that people fill it out today and send it back to him
  • Two e-mails; first with complete form, second with simpler form
  • Please fill out complete form, or simpler form if necessary