WormBase-Caltech Weekly Calls August 2015
- 1 August 6, 2015
- 2 August 13, 2015
- 3 August 20, 2015
- 4 August 27, 2015
August 6, 2015
- prioritize new data types into WormMine
- RNAi phenotype, interactions, human disease...
- WormMine wiki page: http://wiki.wormbase.org/index.php/WormMine
- Wen wants to use the machine when WormMart retires
UniProt/WormBase gene class
- Need to talk to the UniProt C. elegans curator
Raymond, Chris and Juancarlos are working on a phenotype viewer
James: given a list of genes, find which tissues they are enriched in
- Python code
- biotype ontology, tissue expression from Postgres as input
August 13, 2015
We are really sad about our friend Bill Gelbart
We will try to help FlyBase as much as possible
Phenotype term annotation summary graph
Goal: Provide an ontology-relationship-aware summary view of a gene's phenotype annotations.
Prototype links:
- aex-3 (fewer phenotypes): existing phenotype widget <http://www.wormbase.org/species/c_elegans/gene/WBGene00000086#-b-3>, summary graph <http://22.214.171.124/~azurebrd/cgi-bin/amigo.cgi?action=annotSummaryGraph&focusTermId=WBGene00000086>
- daf-2 (lots of phenotypes): existing phenotype widget <http://www.wormbase.org/species/c_elegans/gene/WBGene00000898#-b-3>, summary graph <http://126.96.36.199/~azurebrd/cgi-bin/amigo.cgi?action=annotSummaryGraph&focusTermId=WBGene00000898>
Proposed development procedure:
- standalone prototyping, commenting and improvements within the group.
- implementation as a widget on dev site (juancarlos.wormbase.org), more testing and soliciting comments from selected end users.
- committing to main site for general use.
Outline of graph processing:
To gather information:
- WOBr query to collect all phenotypes annotated to the gene of interest.
- WOBr query to collect all transitive relationships of the phenotypes from the first query towards the ontology root.
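A minimal sketch of the second gathering step, for illustration only. It does not use WOBr; it assumes the ontology is already available in memory as a child-to-parents map. The map `PARENTS`, its term names, and the function `collect_closure` are all hypothetical.

```python
from collections import deque

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "AB": ["A", "B"],
}


def collect_closure(annotated_terms, parents):
    """Walk from the annotated terms towards the root.

    Returns the set of terms reached and the set of (child, parent)
    edges traversed along the way.
    """
    seen = set(annotated_terms)
    edges = set()
    queue = deque(annotated_terms)
    while queue:
        term = queue.popleft()
        for parent in parents.get(term, []):
            edges.add((term, parent))
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen, edges
```

Calling `collect_closure(["A1", "AB"], PARENTS)` on the toy map returns the two annotated terms plus `A`, `B` and `root`, together with the edges connecting them.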
To simplify and to control graph size:
- Remove all nodes (phenotype terms) that are neither directly annotated nor at branching points where two branches of annotations merge (LCA, lowest common ancestor, if you will).
- Scale node size according to annotation count (including inferred annotations). Node width is (scale multiplier of 1.5 * log(max count in graph) / log(current node count)) + minimum size value (0.1).
- Limit appearance of labels to nodes above a given size (roughly big enough to hold the term name).
- Show annotation counts in a mouse-over bubble and add a hyperlink to the term page on each node.
- Add a key for red vs. blue nodes.
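The pruning and inferred-count steps above can be sketched as follows. This is a simplified illustration, not the prototype's code. It assumes an in-memory child-to-parents map and treats any term with two or more children in the annotated subgraph as a branching point. The toy data and the names `inferred_counts` and `prune` are hypothetical. Node sizing and colouring are left out.

```python
# Hypothetical toy ontology (child -> direct parents) and direct annotation counts.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "A1x": ["A1"],
    "B1": ["B"],
}
DIRECT = {"A1x": 2, "A2": 1, "B1": 3}


def inferred_counts(direct, parents):
    """Add each term's direct count to itself and to every ancestor."""
    totals = {}
    for term, count in direct.items():
        stack, seen = [term], {term}
        while stack:
            node = stack.pop()
            totals[node] = totals.get(node, 0) + count
            for parent in parents.get(node, []):
                if parent not in seen:  # count once even if reached by two paths
                    seen.add(parent)
                    stack.append(parent)
    return totals


def prune(direct, parents):
    """Keep directly annotated terms and branching points, then reconnect them."""
    totals = inferred_counts(direct, parents)
    children = {}
    for node in totals:
        for parent in parents.get(node, []):
            children.setdefault(parent, set()).add(node)
    keep = {n for n in totals if n in direct or len(children.get(n, ())) >= 2}
    edges = set()
    for node in keep:
        # Climb through removed terms until the nearest kept ancestor is found.
        stack = list(parents.get(node, []))
        seen = set(stack)
        while stack:
            ancestor = stack.pop()
            if ancestor in keep:
                edges.add((node, ancestor))
                continue
            for parent in parents.get(ancestor, []):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
    return keep, edges, totals
```

On the toy data, `prune(DIRECT, PARENTS)` drops the pass-through terms `A1` and `B` and links `A1x` to `A` and `B1` to `root`. In a real ontology with multiple inheritance this simple rule may leave redundant edges, so it is only a starting point.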
International Biocuration Conference
Propose to submit paper on Community Curation
- Mary Ann happy to lead.
- Daniela on board.
- Push out the emails on Sept. 1 (Ranjana and Chris are thinking about it) so we can get some submissions in September.
- We should visit labs to publicize annotation; target labs with data.
- Target senior graduate students.
ModSeek and SPELL updates
- Wen is working with Seek developer to install a ModSeek mirror, still debugging the software.
- Wen will try to install a WormSPELL mirror on the Textpresso server because the machine still runs RedHat 5, which meets SPELL's requirements.
August 20, 2015
Phenotype annotation display
- New graph view available that emphasizes number of annotations
- Raymond and Juancarlos working on alternate displays to allow for rapid visual review of all annotations
- Color scheme (red/blue) not necessarily intuitive; maybe use dashed/bold lines for circles
- Including "Variant" root is good to show total annotations, maybe change shape to be distinct
- Will send around to users for testing/feedback before establishing a widget on the main page
WormSeek mirror running on Athena
- Do we still need another machine (Caprica)? Cannot run on Athena long term
- modSeek vs. SPELL: modSeek is easy to install but uses a lot of disk space and memory
- Probably need ~2 Terabytes disk space, ~16 Gigabytes of memory for WormSeek
Handling retracted papers
- Handle similarly to invalid papers?
- Full vs. partial retraction (e.g. one figure)
- Currently only 4 papers in the WB corpus are retracted
- We would want an alert to the whole group notifying of WBPaper retraction
- Is there a systematic way to find these papers?
- Is there a systematic way to deal with the data curated from papers that are then retracted?
- These two papers were retracted; both have curated data in WB attached to them:
WBPaper00042572 Leung, C. K., Wang, Y., Deonarine, A., Tang, L., Prasse, S., & Choe, K. P. (2013). A negative-feedback loop between the detoxification/antioxidant response factor SKN-1 and its repressor WDR-23 matches organism needs with environmental conditions. Mol Cell Biol, 33, 3524-37. doi:10.1128/MCB.00245-13
WBPaper00045362 Leung, C. K., Hasegawa, K., Wang, Y., Deonarine, A., Tang, L., Miwa, J., & Choe, K. P. (2014). Direct interaction between the WD40 repeat protein WDR-23 and SKN-1/Nrf inhibits binding to target DNA. Mol Cell Biol, 34, 3156-67. doi:10.1128/MCB.00114-14
- Should we handle these manually or automatically? Few enough that we can probably handle manually
- Indexing micro-publications in PubMed? Could include them as WormBook publications, since WormBook is indexed by PubMed
- Interest in general micro-publications at Caltech (in other fields)?
- Where (what datatypes) would people be willing to contribute micro-publications?
- How would the micro-publications be made visible?
- On WormBase
- Indexing by PubMed would be valuable
- Meetings coming up to discuss new management arrangement of FlyBase
- Meeting at end of October
August 27, 2015
- Wen spoke to developer of modSeek (will leave in February; would like to add features in the meantime)
- Another developer (Alicia?) will be working on it
- modSeek is still a work in progress; bugs being fixed, may be unstable at times
- SGD still running SPELL on RedHat 5 (can run until next year); not sure if it will get updated
- There is a dependency on the version of Ruby on Rails
- Could be a security concern if software is not kept up-to-date
- SPELL is distinct from modSeek; modSeek is not a replacement for SPELL
- Is there any true replacement for SPELL available? Will there be? Probably over time
PATO Entity-Quality style phenotype annotation
- Gary, Karen, Chris, and Juancarlos working on adapting the data models and Phenotype (and RNAi) OA for EQ pair style annotation
- Six entities to be included (each with their own distinct quality): Anatomy, Life Stage, (Endogenous) Molecule Affected, GO Process, GO Function, GO Component
- For the time being, we will keep the phenotype ontology and annotate to both EQ statements (when possible) and the phenotype ontology terms
- Eventually we will want an intelligent system to create (and maintain) granular phenotype terms with unique WBPhenotype IDs for each post-composed annotation
- Would be best if we could have (more-or-less) orthogonal axes for each ontology (anatomy, life stage, etc.) that can automate the relationships and annotation display heuristics
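One way to picture the dual annotation described above (a phenotype ontology term plus zero or more EQ statements) is the data-structure sketch below. It is illustrative only and does not reflect the actual data models or the OA. The class and field names are hypothetical, and the ontology IDs are placeholders, not real terms.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EntityType(Enum):
    """The six entity axes listed in the notes."""
    ANATOMY = "Anatomy"
    LIFE_STAGE = "Life Stage"
    MOLECULE_AFFECTED = "Molecule Affected"
    GO_PROCESS = "GO Process"
    GO_FUNCTION = "GO Function"
    GO_COMPONENT = "GO Component"


@dataclass(frozen=True)
class EQStatement:
    """One entity paired with its own quality (e.g. a PATO term)."""
    entity_type: EntityType
    entity_id: str
    quality_id: str


@dataclass
class PhenotypeAnnotation:
    """A gene annotated to a phenotype term and, when possible, EQ statements."""
    gene_id: str
    phenotype_term_id: str
    eq_statements: List[EQStatement] = field(default_factory=list)

    def entity_types(self):
        """Return the set of entity axes used by this annotation."""
        return {eq.entity_type for eq in self.eq_statements}


# Placeholder IDs for illustration only.
annotation = PhenotypeAnnotation(
    gene_id="WBGene00000898",
    phenotype_term_id="WBPhenotype:PLACEHOLDER",
    eq_statements=[
        EQStatement(EntityType.ANATOMY, "WBbt:PLACEHOLDER", "PATO:PLACEHOLDER"),
        EQStatement(EntityType.LIFE_STAGE, "WBls:PLACEHOLDER", "PATO:PLACEHOLDER"),
    ],
)
```

Keeping the phenotype term as a required field and the EQ list as optional mirrors the interim plan of annotating to both where possible.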
Phenotype ontology graphic display
- Raymond and Juancarlos are cleaning and vetting scripts
- Next step to work with Beta testers: IWM feedback, relevant users, survey responses, etc.
- Once we've accommodated feedback, we can move the display onto WormBase proper
- Once we've worked out the details of getting the display on WormBase, we can work on other ontology annotation displays
- Issues with developing on dev machines versus live site
- We should probably have dev machines run the display for short periods of time, specifically for Beta testing
- Caltech needs to talk to Todd about preferred protocol (place on development/staging instance before it's committed to go live?)
Gene-enrichment tool (based on anatomy annotation)
- David Angeles developed a gene-enrichment analysis tool based on expression annotations to anatomy terms
- Raymond is continuing to work on it
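For readers unfamiliar with this kind of analysis, here is a minimal sketch of anatomy-term enrichment. The notes do not say which statistic the tool uses; a one-sided hypergeometric test is assumed here purely for illustration, with no multiple-testing correction. The gene names, tissue names, and function names are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for a hypergeometric draw without replacement."""
    upper = min(successes, draws)
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def tissue_enrichment(gene_list, tissue_to_genes, background):
    """Return (tissue, overlap, p-value) tuples sorted by p-value."""
    genes = set(gene_list) & background
    results = []
    for tissue, annotated in tissue_to_genes.items():
        annotated = annotated & background
        overlap = len(genes & annotated)
        p = hypergeom_sf(overlap, len(background), len(annotated), len(genes))
        results.append((tissue, overlap, p))
    return sorted(results, key=lambda row: row[2])


# Hypothetical toy data: ten background genes and two tissues.
BACKGROUND = {f"g{i}" for i in range(10)}
TISSUES = {
    "neuron": {"g0", "g1", "g2"},
    "muscle": {"g3", "g4", "g5", "g6", "g7"},
}
RESULTS = tissue_enrichment(["g0", "g1", "g2"], TISSUES, BACKGROUND)
```

With the toy data, all three query genes fall in the "neuron" set, so that tissue ranks first. A real analysis would also need to account for the hierarchy of anatomy terms and for testing many terms at once.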
FlyBase Human Disease curation
- Xiaodong spoke to FlyBase disease curator
- Talking about connecting primary reagents used for disease models (e.g. a polyglutamine antibody)
- Sibyl working on scripts to pull in data
- Xiaodong & Sibyl will ask Todd to accommodate the data (storage and processing)
- Juancarlos working with Kevin and Thomas on integrating/migrating ACEDB data (priority GeneAce) to Datomic