WormBase-Caltech Weekly Calls April 2015

April 2015

April 2, 2015

Adding Species information to nightly geneace dumps

WormBase Ontology Browser WS248 online on Dev

  • Caveat: the rest of the WormBase data on the Dev site is WS247 (one release behind)
    • Would explain some discrepancies

Micropublication form

  • Daniela waiting for feedback from Paul's lab
  • Will send to Alison Frand and Elissa Hallem at UCLA to get input

Mary Ann's IWM abstract for community annotation

  • People can send comments over the next day or two
  • Will make revisions and send around next week

April 9, 2015

Tazendra storage and backups

  • Had some glitches with secondary hard drive this week, not sure what the problem is
  • The /home2 directory is not intended as primary storage; curators need to back their data up
  • Paper PDFs stored on /home2 as the primary storage location
  • Papers are backed up on Athena (Wen's computer)
  • TIF files are stored on CDs
  • Main Tazendra hard drive is mirrored (RAID) and backed up: nightly backups are kept for ~1 month (or 0.5 month) and monthly backups for a year, but no older versions are kept
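
As a rough illustration of the retention scheme described above, the following sketch checks whether a backup from a given day would still be available. The function name, the 30-day and 365-day windows, and the choice of the 1st of the month as the monthly copy are assumptions for illustration only; the actual backup configuration on Tazendra was not discussed in this detail.

```python
from datetime import date

# Assumed retention windows, read loosely from the notes above.
NIGHTLY_DAYS = 30    # "~1 month" (the notes also mention 0.5 month)
MONTHLY_DAYS = 365   # "monthly for a year"


def is_retained(backup_day: date, today: date) -> bool:
    """Return True if a backup taken on backup_day would still exist on today."""
    age = (today - backup_day).days
    if age < 0:
        return False  # a date in the future cannot have a backup
    if age <= NIGHTLY_DAYS:
        return True  # every nightly backup is kept inside this window
    if age <= MONTHLY_DAYS:
        # Assumption: the monthly copy is the backup taken on the 1st.
        return backup_day.day == 1
    return False  # nothing older than a year is kept


if __name__ == "__main__":
    today = date(2015, 4, 9)
    for day in (date(2015, 4, 1), date(2015, 2, 15), date(2015, 2, 1)):
        print(day, is_retained(day, today))
```

Under these assumptions, a file changed in mid-February could only be recovered as of the start of a month, which is why data kept outside the backed-up drive needs its own copy.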

Citace data upload

  • April 28th, 10am

Community Annotation Forms

  • Mary Ann's IWM (poster) abstract includes community annotation forms
  • Community annotation form list:
    • Micropublication
    • Gene Expression form (made 15 years ago); mostly not used; micropublication form could be co-opted
    • Gene description form (primarily for the community to edit the automated concise descriptions)
    • "Submit Data" list on WormBase (http://www.wormbase.org/about/userguide/submit_data#0--10)
  • Would be good to have a community annotation portal/landing page
  • Mary Ann will send around a summary of discussions about community annotation, along with prioritization of forms for curators and Juancarlos
  • Relevant curators should consider reviewing existing forms for updates

modSeek

  • http://seek.princeton.edu/modSeek/
  • Princeton 6th year grad student developed it (leaving within 6 months)
  • "Seek" was developed for human microarray and RNA-seq data
  • 5 model organisms in "modSeek": yeast, worm, fly, mouse, zebrafish
  • Pulls in data from Gene Expression Omnibus (GEO)
  • Analysis computed from raw data
  • Data from SPELL should be reasonably easy to transfer to modSeek
  • We need to determine if WormBase data can be updated per release
  • Has text mining
  • Cross-species comparisons available

April 16, 2015

Topic Images

  • Daniela importing Lipid Metabolism topic images
  • Daniela may also try to create gene-image connections
  • Picture OA could capture relevant genes
  • For now, Daniela will only import images for which we have permissions

WormBook Chapters

  • Would be good to make sure relevant FTP files are mentioned and pointed to
  • We can add a section on concise description and automated description
    • Not sure yet where to put it; maybe in introduction chapter

April 23, 2015

WormBook Chapters

  • Importing figures or screenshots can result in lost resolution
  • We may need to play with image manipulation/editing to fix
  • Gene function chapter has an InterMine section; maybe it should be an appendix or a separate chapter
    • Will make a separate chapter; will reference InterMine with a single screenshot in the gene function chapter
  • We will make sure the gene function chapter properly links to the GO chapter

WikiPathways

  • Karen created Lipid metabolism pathway
  • Dennis Kim and Jonathan Ewbank are updating their WormBook chapter on innate immunity; working with Karen on WikiPathways

LEGO

  • Kimberly participating in conference calls
  • Progress being made on back-end to handle evidence properly
  • Curators have been going through specific use cases
  • Discussing in general how best to model different biological scenarios
  • Estimate for getting back end finished on the order of weeks
  • Front-end still being developed; getting close
  • Karen asks: How will LEGO integrate with Process&Pathway curation at WormBase?
    • Kimberly: LEGO is very molecular-function-centric; may be more granular than WB Topics/Process&Pathway
    • We should see soon how these two will interface

Controlled vocabulary for institutions

  • How to handle synonyms of institution names
  • We will accept entries of synonyms, but will only actually store/track the (English) official name
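
The synonym policy above could be sketched as a lookup that accepts any known variant but returns only the official name for storage. This is a minimal sketch, not the form's actual implementation: the vocabulary entries, `normalize_institution`, and the matching rule (case-insensitive, whitespace-collapsed) are all assumptions for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical vocabulary: official (English) name -> accepted synonyms.
OFFICIAL_NAMES: Dict[str, List[str]] = {
    "California Institute of Technology": ["Caltech", "Cal Tech"],
    "University of California, Los Angeles": ["UCLA"],
}


def _key(name: str) -> str:
    """Normalize case and whitespace so trivial variants compare equal."""
    return " ".join(name.casefold().split())


# Build the lookup once: every variant, including the official name itself,
# points at the official name.
SYNONYM_INDEX: Dict[str, str] = {}
for official, synonyms in OFFICIAL_NAMES.items():
    for variant in [official, *synonyms]:
        SYNONYM_INDEX[_key(variant)] = official


def normalize_institution(entry: str) -> Optional[str]:
    """Return the official name to store for an entry, or None if unknown."""
    return SYNONYM_INDEX.get(_key(entry))


if __name__ == "__main__":
    print(normalize_institution("  caltech "))
    print(normalize_institution("Some Unknown Institute"))
```

Returning `None` for unknown entries leaves it to a curator to decide whether the entry is a new institution or a new synonym of an existing one.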

April 30, 2015

modSeek

  • http://seek.princeton.edu/modSeek/
  • Wen has established an account
  • Wen is running scripts to generate worm data for modSeek
  • Data in modSeek is GEO (Gene Expression Omnibus) based
  • Links at modSeek go to GEO data, not WormBase
  • Wen is considering a WB-specific search with links back to WB data
  • modSeek only has (and supports) C. elegans data, not other nematodes for which we have data
  • Maybe we could have a "WormBaseSeek"/"WB-Seek"?
  • We can run a mirror at WB
  • Wen is modifying SPELL scripts for modSeek
  • Would we be able to perform cross-species comparisons with a WB-Seek?
  • Textpresso-Dev is currently running our modSeek instance
    • Textpresso-Dev probably would not be able to handle mouse, human, fly data for species comparisons etc.
  • We would like to be able to perform cross-species comparisons for all nematode species as well as the other MODs/human
  • WB data is paper-centric, unlike data in modSeek
  • WB data is processed data (author processed), modSeek data is raw data
    • We want to be able to keep/maintain the processed data in addition to the raw data

BioCurator Meeting 2015 (China)

  • ~300 people in attendance, about 1/3 from outside China
  • Mouse (MGI), yeast (SGD), EBI, NCBI, and WB represented at meeting
  • Xiaodong's training session
    • Q&A about literature curation
    • Many small bio-databases in China, focused mainly on genome annotation analysis; no literature curation
    • Xiaodong demonstrated the Ontology Annotator (OA) tool
  • Currently there is no bio-data center in China; Biocuration society would like to establish one
  • Web Apollo, JBrowse, phylogenetic GO annotation info presented
  • Yeast database (SGD?) uses IntAct (in collaboration with BioGRID) to curate protein complexes
  • There was considerable discussion about the use of Uberon as a cross-species anatomy ontology
    • Would be good for WB to establish connections to Uberon with C. elegans anatomy ontology
    • This will require specification of relationships; non-trivial
  • Yuling presented work on SVM analysis
    • People had questions about whether SVM could be used for hierarchical flagging (data types and then subtypes)
      • This requires training sets; could consider moving forward
      • (Karen) We already run SVM for allele data followed by Textpresso entity recognition
      • We would likely run the entire corpus through the SVM for a whole new datatype with proper training sets
    • Can SVM be applied at the paragraph or sentence level?
      • Possible, and we have some curated sentences saved, for example with cell component curation (CCC)
  • UniProt developing UniRule system for large scale protein annotation
    • Manually establish (text mining?) rules for future recognition and annotation
  • Reactome is beginning to use ORCID person IDs to give attribution for Reactome pathways
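
The hierarchical flagging question raised during the SVM discussion above (flag data types first, then subtypes) could be sketched as a two-stage pass. This is only an illustration of the control flow: the data type and subtype names are hypothetical, and simple keyword matchers stand in for trained SVMs, which would each need their own training set.

```python
from typing import Callable, Dict, List

Classifier = Callable[[str], bool]


def keyword_classifier(*keywords: str) -> Classifier:
    """Stand-in for a trained SVM: flags text containing any keyword."""
    lowered = [k.lower() for k in keywords]
    return lambda text: any(k in text.lower() for k in lowered)


def flag_paper(
    text: str,
    type_classifiers: Dict[str, Classifier],
    subtype_classifiers: Dict[str, Dict[str, Classifier]],
) -> Dict[str, List[str]]:
    """Run data-type classifiers first, then subtype classifiers only
    for the data types that were flagged."""
    flags: Dict[str, List[str]] = {}
    for data_type, is_type in type_classifiers.items():
        if not is_type(text):
            continue  # skip subtype classification for unflagged types
        subs = subtype_classifiers.get(data_type, {})
        flags[data_type] = [name for name, is_sub in subs.items() if is_sub(text)]
    return flags


# Hypothetical data types and subtypes, for illustration only.
TYPE_CLASSIFIERS = {
    "expression": keyword_classifier("expressed in", "gfp reporter"),
    "allele": keyword_classifier("allele", "mutant"),
}
SUBTYPE_CLASSIFIERS = {
    "expression": {
        "anatomy": keyword_classifier("neuron", "pharynx"),
        "subcellular": keyword_classifier("nucleus", "membrane"),
    },
}

if __name__ == "__main__":
    sentence = "The GFP reporter was expressed in pharynx."
    print(flag_paper(sentence, TYPE_CLASSIFIERS, SUBTYPE_CLASSIFIERS))
```

Because the input is just a string, the same function could be applied to a whole paper, a paragraph, or a single sentence, which relates to the question of running SVM at the paragraph or sentence level.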

FTE estimates for WB staff

  • Paul S. sent around an e-mail with a spreadsheet for filling out FTE estimates
  • Paul asks that people fill it out today and send it back to him
  • Two e-mails; first with complete form, second with simpler form
  • Please fill out complete form, or simpler form if necessary