WormBase-Caltech Weekly Calls April 2015
April 2015
April 2, 2015
Adding Species information to nightly geneace dumps
- Michael Paulini is adding Species information for the WBGene IDs in the genes.ace file
- Does anyone want Species information for any of the other files?
- For list of files, see: ftp://ftp.sanger.ac.uk/pub/consortia/wormbase/STAFF/mh6/nightly_geneace/
- Data in postgres in table gin_species
WormBase Ontology Browser WS248 online on Dev
- Caveat: the rest of the WormBase data on the Dev site is WS247 (one release behind)
- Would explain some discrepancies
Micropublication form
- Daniela waiting for feedback from Paul's lab
- Will send to Alison Frand and Elissa Hallem at UCLA to get input
Mary Ann's IWM abstract for community annotation
- People can send comments over the next day or two
- Will make revisions and send around next week
April 9, 2015
Tazendra
- Had some glitches with secondary hard drive this week, not sure what the problem is
- /home2 directory is not intended as primary storage; curators need to back up their data
- Paper PDFs stored on /home2 as the primary storage location
- Papers are backed up on Athena (Wen's computer)
- TIF files are stored on CDs
- Main Tazendra hard drive is RAIDed/mirrored and backed up: nightly backups are kept for up to ~1 month (or 0.5 month), monthly backups for a year; no older versions are kept
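The retention scheme described above can be sketched roughly as follows. This is only an illustration, not the actual Tazendra backup configuration: the function name `keep_backup`, the 30-day nightly window, and the choice of the 1st of the month as the monthly snapshot are all assumptions.

```python
from datetime import date

NIGHTLY_DAYS = 30    # assumption: "~1 month" of nightly backups
MONTHLY_DAYS = 365   # monthly backups kept for a year


def keep_backup(backup_day: date, today: date) -> bool:
    """Return True if a backup taken on backup_day would still be retained today."""
    age = (today - backup_day).days
    if age < 0:
        return False  # a backup dated in the future is not valid
    if age <= NIGHTLY_DAYS:
        return True   # every nightly backup inside the recent window is kept
    # Assumption: the monthly snapshot is the backup taken on the 1st of the month.
    return age <= MONTHLY_DAYS and backup_day.day == 1


if __name__ == "__main__":
    today = date(2015, 4, 9)
    for day in (date(2015, 4, 1), date(2015, 2, 15), date(2015, 2, 1), date(2014, 3, 1)):
        print(day, keep_backup(day, today))
```

Under these assumptions, anything that exists only on /home2 or that is older than a year cannot be recovered from the main drive's backups.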
Citace data upload
- April 28th, 10am
Community Annotation Forms
- Mary Ann's IWM (poster) abstract includes community annotation forms
- Community annotation form list:
- Micropublication
- Gene Expression form (made 15 years ago); mostly not used; micropublication form could be co-opted
- Gene description form (primarily for the community to edit the automated concise descriptions)
- "Submit Data" list on WormBase (http://www.wormbase.org/about/userguide/submit_data#0--10)
- Would be good to have a community annotation portal/landing page
- Mary Ann will send around a summary of discussions about community annotation, along with prioritization of forms for curators and Juancarlos
- Relevant curators should consider reviewing existing forms for updates
modSeek
- http://seek.princeton.edu/modSeek/
- Developed by a 6th-year Princeton grad student (leaving within 6 months)
- "Seek" was developed for human microarray and RNA-seq data
- 5 model organisms in "modSeek": yeast, worm, fly, mouse, zebrafish
- Pulls in data from Gene Expression Omnibus (GEO)
- Analysis computed from raw data
- Data from SPELL should be reasonably transferable to modSeek
- We need to determine if WormBase data can be updated per release
- Has text mining
- Cross-species comparisons available
April 16, 2015
Topic Images
- Daniela importing Lipid Metabolism topic images
- Daniela may also try to create gene-image connections
- Picture OA could capture relevant genes
- For now, Daniela will only import images for which we have permissions
WormBook Chapters
- Would be good to make sure relevant FTP files are mentioned and pointed to
- We can add a section on concise description and automated description
- Not sure yet where to put it; maybe in introduction chapter
April 23, 2015
WormBook Chapters
- Importing figures or screenshots can result in lost resolution
- We may need to play with image manipulation/editing to fix
- Gene function chapter has an InterMine section; maybe it should be an appendix or a separate chapter
- Will make a separate chapter; will reference InterMine with a single screenshot in the gene function chapter
- We will make sure the gene function chapter properly links to GO chapter
WikiPathways
- Karen created Lipid metabolism pathway
- Dennis Kim and Jonathan Ewbank are updating their WormBook chapter on innate immunity; working with Karen on WikiPathways
LEGO
- Kimberly participating in conference calls
- Progress being made on back-end to handle evidence properly
- Curators have been going through specific use cases
- Discussing in general how best to model different biological scenarios
- Estimate for getting back end finished on the order of weeks
- Front-end still being developed; getting close
- Karen asks: How will LEGO integrate with Process&Pathway curation at WormBase?
- Kimberly: LEGO is very molecular-function-centric; may be more granular than WB Topics/Process&Pathway
- We should see soon how these two will interface
Controlled vocabulary for institutions
- How to handle synonyms of institution names
- We will accept entries of synonyms, but will only actually store/track the (English) official name
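A minimal sketch of the synonym handling described above: entries may be typed as any known synonym, but only the official (English) name is returned for storage. The mapping contents and the function name `normalize_institution` are hypothetical examples, not the actual controlled vocabulary or form code.

```python
from typing import Optional

# Hypothetical example data: known synonyms (lowercased) mapped to official names.
SYNONYM_TO_OFFICIAL = {
    "caltech": "California Institute of Technology",
    "california institute of technology": "California Institute of Technology",
    "ucla": "University of California, Los Angeles",
    "university of california, los angeles": "University of California, Los Angeles",
}


def normalize_institution(entry: str) -> Optional[str]:
    """Return the official name to store for an entered name, or None if unknown."""
    # Collapse runs of whitespace and ignore case before looking up the synonym.
    key = " ".join(entry.split()).lower()
    return SYNONYM_TO_OFFICIAL.get(key)


if __name__ == "__main__":
    print(normalize_institution("  Caltech "))
    print(normalize_institution("Unknown University"))
```

Returning `None` for an unrecognized entry leaves it to the caller to decide whether to reject the entry or queue it for curator review.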
April 30, 2015
modSeek
- http://seek.princeton.edu/modSeek/
- Wen has established an account
- Wen is running scripts to generate worm data for modSeek
- Data in modSeek is based on GEO (Gene Expression Omnibus)
- Links at modSeek go to GEO data, not WormBase
- Wen is considering a WB-specific search with links back to WB data
- modSeek only has (and supports) C. elegans data, not other nematodes for which we have data
- Maybe we could have a "WormBaseSeek"/"WB-Seek"?
- We can run a mirror at WB
- Wen is modifying SPELL scripts for modSeek
- Would we be able to perform cross-species comparisons with a WB-Seek?
- Textpresso-Dev is currently running our modSeek instance
- Textpresso-Dev probably would not be able to handle mouse, human, fly data for species comparisons etc.
- We would like to be able to perform cross-species comparisons for all nematode species as well as the other MODs/human
- WB data is paper-centric, unlike data in modSeek
- WB data is processed data (author processed), modSeek data is raw data
- We want to be able to keep/maintain the processed data in addition to the raw data
BioCurator Meeting 2015 (China)
- ~300 people in attendance, about 1/3 from outside China
- Mouse (MGI), yeast (SGD), EBI, NCBI, and WB represented at meeting
- Xiaodong's training session
- Q&A about literature curation
- Many small bio-databases in China, focused mainly on genome annotation analysis; no literature curation
- Xiaodong demonstrated the Ontology Annotator (OA) tool
- Currently there is no bio-data center in China; Biocuration society would like to establish one
- Web Apollo, JBrowse, and phylogenetic GO annotation info presented
- Yeast database (SGD?) uses IntAct (in collaboration with BioGRID) to curate protein complexes
- There was considerable discussion about the use of Uberon as a cross-species anatomy ontology
- Would be good for WB to establish connections to Uberon with C. elegans anatomy ontology
- This will require specification of relationships; non-trivial
- Yuling presented work on SVM analysis
- People had questions about whether SVM could be used for hierarchical flagging (data types and then subtypes)
- This requires training sets; could consider moving forward
- (Karen) We already run SVM for allele data followed by Textpresso entity recognition
- We would likely run the entire corpus through the SVM for a whole new datatype with proper training sets
- Can SVM be applied at the paragraph or sentence level?
- Possible, and we have some curated sentences saved, for example with cell component curation (CCC)
- UniProt developing UniRule system for large scale protein annotation
- Manually establish (text mining?) rules for future recognition and annotation
- Reactome is beginning to use ORCID person IDs to give attribution for Reactome pathways
FTE estimates for WB staff
- Paul S. sent around an e-mail with a spreadsheet for filling out FTE estimates
- Paul asks that people fill it out today and send it back to him
- Two e-mails; first with complete form, second with simpler form
- Please fill out complete form, or simpler form if necessary