April 2015

April 2, 2015

Adding Species information to nightly geneace dumps

- Michael Paulini is adding Species information for the WBGene IDs in the genes.ace file
- Does anyone want Species information for any of the other files?
- For list of files, see: ftp://ftp.sanger.ac.uk/pub/consortia/wormbase/STAFF/mh6/nightly_geneace/
- Data in postgres in table gin_species

WormBase Ontology Browser WS248 online on Dev

Caveat: the rest of the WormBase data on the Dev site is WS247 (one release behind)
- Would explain some discrepancies

Micropublication form

Daniela waiting for feedback from Paul's lab
Will send to Alison Frand and Elissa Hallem at UCLA to get input

Mary Ann's IWM abstract for community annotation

People can send comments over the next day or two
Will make revisions and send around next week

April 9, 2015

Tazendra

Had some glitches with secondary hard drive this week, not sure what the problem is
The /home2 directory is not intended as primary storage; curators need to back their data up
Paper PDFs stored on /home2 as the primary storage location
Papers are backed up on Athena (Wen's computer)
TIF files are stored on CDs
Main Tazendra hard drive is mirrored (RAID) and backed up: nightly backups are kept for ~1 month (or 0.5 month), monthly backups for a year; no older versions are kept

Citace data upload

April 28th, 10am

Community Annotation Forms

Mary Ann's IWM (poster) abstract includes community annotation forms
Community annotation form list:
- Micropublication
- Gene Expression form (made 15 years ago); mostly not used; micropublication form could be co-opted
- Gene description form (primarily for the community to edit the automated concise descriptions)
- "Submit Data" list on WormBase (http://www.wormbase.org/about/userguide/submit_data#0--10)
Would be good to have a community annotation portal/landing page
Mary Ann will send around a summary of discussions about community annotation, along with prioritization of forms for curators and Juancarlos
Relevant curators should consider reviewing existing forms for updates

modSeek

http://seek.princeton.edu/modSeek/
Princeton 6th year grad student developed it (leaving within 6 months)
"Seek" was developed for human microarray and RNA-seq data
5 model organisms in "modSeek": yeast, worm, fly, mouse, zebrafish
Pulls in data from Gene Expression Omnibus (GEO)
Analysis computed from raw data
Data from SPELL should be reasonably transferable to modSeek
We need to determine if WormBase data can be updated per release
Has text mining
Cross-species comparisons available

April 16, 2015

Topic Images

Daniela importing Lipid Metabolism topic images
Daniela may also try to create gene-image connections
Picture OA could capture relevant genes
For now, Daniela will only import images for which we have permissions

WormBook Chapters

Would be good to make sure relevant FTP files are mentioned and pointed to
We can add a section on concise description and automated description
- Not sure yet where to put it; maybe in introduction chapter

April 23, 2015

WormBook Chapters

Importing figures or screenshots can result in lost resolution
We may need to play with image manipulation/editing to fix
Gene function chapter has an InterMine section; maybe it should be an appendix or a separate chapter
- Will make a separate chapter; will reference InterMine with a single screenshot in the gene function chapter
We will make sure the gene function chapter properly links to GO chapter

Wikipathway

Karen created Lipid metabolism pathway
Dennis Kim and Jonathan Ewbank are updating their WormBook chapter on innate immunity; working with Karen on WikiPathway

LEGO

Kimberly participating in conference calls
Progress being made on back-end to handle evidence properly
Curators have been going through specific use cases
Discussing in general how best to model different biological scenarios
Estimate for getting back end finished on the order of weeks
Front-end still being developed; getting close
Karen asks: How will LEGO integrate with Process&Pathway curation at WormBase?
- Kimberly: LEGO is very molecular-function-centric; may be more granular than WB Topics/Process&Pathway
- We should see soon how these two will interface

Controlled vocabulary for institutions

How to handle synonyms of institution names
We will accept entries of synonyms, but will only actually store/track the (English) official name
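One possible way to sketch the synonym handling described above: accept any known synonym on entry, but store only the official English name. The vocabulary entries, the function names (`build_synonym_index`, `normalize_institution`) and the case-insensitive matching are illustrative assumptions, not an existing WormBase implementation.

```python
# Hypothetical controlled vocabulary: official (English) name -> accepted synonyms.
# These entries are examples only, not the actual WormBase vocabulary.
OFFICIAL_NAMES = {
    "California Institute of Technology": ["Caltech", "Cal Tech"],
    "University of California, Los Angeles": ["UCLA"],
}


def build_synonym_index(vocab):
    """Map every official name and synonym (case-folded) to its official name."""
    index = {}
    for official, synonyms in vocab.items():
        for name in [official, *synonyms]:
            index[name.strip().casefold()] = official
    return index


SYNONYM_INDEX = build_synonym_index(OFFICIAL_NAMES)


def normalize_institution(entry):
    """Return the official name to store for an entry, or None if unrecognized."""
    return SYNONYM_INDEX.get(entry.strip().casefold())
```

With this layout, an entry such as "caltech" would be accepted and recorded under the official name, while an unrecognized entry returns None and could be flagged for curator review.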

April 30, 2015

modSeek

http://seek.princeton.edu/modSeek/
Wen has established an account
Wen is running scripts to generate worm data for modSeek
Data in modSeek is G.E.O. (Gene Expression Omnibus) based
Links at modSeek go to G.E.O. data, not WormBase
Wen is considering a WB-specific search with links back to WB data
modSeek only has (and supports) C. elegans data, not other nematodes for which we have data
Maybe we could have a "WormBaseSeek"/"WB-Seek"?
We can run a mirror at WB
Wen is modifying SPELL scripts for modSeek
Would we be able to perform cross-species comparisons with a WB-Seek?
Textpresso-Dev is currently running our modSeek instance
- Textpresso-Dev probably would not be able to handle mouse, human, fly data for species comparisons etc.
We would like to be able to perform cross-species comparisons for all nematode species as well as the other MODs/human
WB data is paper-centric, unlike data in modSeek
WB data is processed data (author processed), modSeek data is raw data
- We want to be able to keep/maintain the processed data in addition to the raw data

BioCurator Meeting 2015 (China)

~300 people in attendance, about 1/3 from outside China
Mouse (MGI), yeast (SGD), EBI, NCBI, and WB represented at meeting
Xiaodong's training session
- Q&A about literature curation
- Many small bio-databases in China, focused mainly on genome annotation analysis; no literature curation
- Xiaodong demonstrated the Ontology Annotator (OA) tool
Currently there is no bio-data center in China; Biocuration society would like to establish one
Web Apollo, JBrowse, phylogenetic G.O. annotation info presented
Yeast database (SGD?) uses IntAct (in collaboration with BioGrid) to curate protein complexes
There was considerable discussion about the use of Uberon as a cross-species anatomy ontology
- Would be good for WB to establish connections to Uberon with C. elegans anatomy ontology
- This will require specification of relationships; non-trivial
Yuling presented work on SVM analysis
- People had questions about whether SVM could be used for hierarchical flagging (data types and then subtypes)
  - This requires training sets; could consider moving forward
  - (Karen) We already run SVM for allele data followed by Textpresso entity recognition
  - We would likely run the entire corpus through the SVM for a whole new datatype with proper training sets
- Can SVM be applied at the paragraph or sentence level?
  - Possible, and we have some curated sentences saved, for example with cell component curation (CCC)
UniProt developing UniRule system for large scale protein annotation
- Manually establish (text mining?) rules for future recognition and annotation
Reactome is beginning to use ORCID person IDs to give attribution for Reactome pathways

FTE estimates for WB staff

Paul S. sent around an e-mail with a spreadsheet for filling out FTE estimates
Paul asks that people fill it out today and send it back to him
Two e-mails; first with complete form, second with simpler form
Please fill out complete form, or simpler form if necessary

WormBase-Caltech Weekly Calls April 2015
