WormBase-Caltech Weekly Calls
Previous Years
2021 Meetings
January 28th, 2021
January 14th, 2021
PubMed LinkOut to WormBase Paper Pages (Kimberly)
- Other databases link out from PubMed to their respective paper pages
- For example, https://pubmed.ncbi.nlm.nih.gov/20864032/ links out to GO and MGI paper pages
- Would like to set this up for WormBase and ultimately for the Alliance, but this will require some developer help
- Work on this next month (after AFP and GO grant submissions)?
Update cycle for HGNC data in the OA (Ranjana)
- Juancarlos had these questions for us:
There's a script (/home/postgres/work/pgpopulation/obo_oa_ontologies/populate_obo_hgnc.pl) that repopulates the postgres obo_*_hgnc tables from Chris and Wen's data. It's not on a cronjob, because I think the files are not updated that often. Do we want to run this every night, or run it manually when the files get regenerated? Or run it every night, check whether the files' timestamps have changed, and repopulate postgres only if they have?
Minutes
PubMed LinkOut to WormBase Paper Pages
Update cycle for HGNC data in the OA
- We will update when Alliance updates the data
- Juancarlos will set the script to check the file timestamps and, if they have changed, update the data for the OAs
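The timestamp check described above could be sketched roughly as follows. This is only an illustration in Python, not the actual script (populate_obo_hgnc.pl is Perl). The function names, the JSON state file, and the idea of passing the repopulation command as an argument are all assumptions made for the example.

```python
import json
import os
import subprocess
from pathlib import Path


def changed_files(paths, state_file):
    """Return (changed, current): paths whose mtime differs from the recorded one,
    and the current mtimes of all paths."""
    state_path = Path(state_file)
    # Hypothetical state file: a JSON object mapping path -> last seen mtime.
    previous = json.loads(state_path.read_text()) if state_path.exists() else {}
    current = {str(p): os.path.getmtime(p) for p in paths}
    changed = [p for p, mtime in current.items() if previous.get(p) != mtime]
    return changed, current


def repopulate_if_changed(paths, state_file, command):
    """Run `command` only when at least one source file has a new timestamp."""
    changed, current = changed_files(paths, state_file)
    if not changed:
        return False
    # check=True raises if the command fails, so the state is not updated.
    subprocess.run(command, check=True)
    # Record the new timestamps only after a successful run.
    Path(state_file).write_text(json.dumps(current))
    return True
```

A nightly cronjob could then call repopulate_if_changed with the list of source data files (their locations are not given in these notes) and a command such as ["perl", "/home/postgres/work/pgpopulation/obo_oa_ontologies/populate_obo_hgnc.pl"]. Postgres would be repopulated only on nights when a file has actually changed.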
CENGEN
- Wen, Daniela, and Raymond will look at the datasets to work out how to incorporate them, starting simple.
- We will make links to pages on their site.
January 21, 2021
Neural Network (NN) Paper Classification Results
- Linking to Paper Display tool (as opposed to Paper Editor) from Michael's webpage for NN results (Michael will make change)
- NN results will be incorporated into the Curation Status Form
- For AFP and VFP, there is now a table with mixed SVM and NN results ("blackbox" results); for a given paper, if NN results exist, they take priority over any SVM results
- Decision: we will omit blackbox results (at least for now) from curation status form (just add the new NN results separately)
- We have stopped running SVM on new papers
- Interactions SVM has performed better than new NN results; would be worth attempting a retraining
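The priority rule for the mixed "blackbox" table (NN results override SVM results for the same paper) can be illustrated with a small sketch. The data layout here is an assumption made for the example: results are keyed by a (paper ID, data type) pair and the values are arbitrary classification labels. The real tables may be organized differently.

```python
def merge_classifier_results(svm_results, nn_results):
    """Combine SVM and NN results, preferring NN whenever both exist.

    Both arguments map a (paper_id, data_type) key to a classification
    value (hypothetical format). Returns a dict mapping the same keys
    to a (source, value) tuple.
    """
    merged = {key: ("SVM", value) for key, value in svm_results.items()}
    # NN entries overwrite SVM entries that share the same key.
    merged.update({key: ("NN", value) for key, value in nn_results.items()})
    return merged
```

Keeping the source label alongside each value makes it straightforward to show NN results separately later, as decided for the Curation Status Form.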
Community Phenotype Curation
- On hold for a few months to commit time to updating the phenotype annotation model to accommodate, e.g. double mutant phenotypes, multiple RNAi targets (intended or otherwise), mutant transgene products causing phenotypes, expressed human genes causing phenotypes, etc.
- Changes made for WB phenotypes may carry over to Alliance phenotype work
- Paper out now on undergrad community phenotype curation project with Lina Dahlberg; we may get more requests for trying this with other undergrad classes
AFP Anatomy Function flagging
- It is sometimes difficult to assess whether an author flag is correct (flags can often be wrong or absent)
- What about giving authors/users feedback on their flagging results?
- Would be good to provide content from paper where this data is said to exist (automatically from a Textpresso pipeline or manually from author identified data)
- We want to be careful about how we provide feedback; we should proactively make improvements/modifications on our end and then bring those back to users for their feedback
January 28, 2021
String Matching Pipelines
- Old pipelines on textpresso-dev are not compatible with the new TPC system
- New TPC API does not support string matching
- New Python library (wbtools) - used by variation pipeline - supports batch processing of WB literature and regex matching
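As a rough idea of what batch regex matching over paper text involves, here is a minimal sketch that uses only Python's standard re module. It does not use or reproduce the wbtools API. The function name, the input format (paper ID mapped to full text), and the example pattern are all assumptions made for illustration.

```python
import re


def match_papers(papers, pattern):
    """Return {paper_id: sorted unique matches} for papers with at least one hit.

    `papers` maps a paper ID to its text (hypothetical input format).
    `pattern` is a regular expression without capture groups.
    """
    compiled = re.compile(pattern)
    hits = {}
    for paper_id, text in papers.items():
        found = sorted(set(compiled.findall(text)))
        if found:
            hits[paper_id] = found
    return hits


# Illustrative pattern only: 1-3 lowercase letters followed by digits,
# loosely resembling an allele name such as e1370.
ALLELE_LIKE = r"\b[a-z]{1,3}\d+\b"
```

For example, match_papers({"p1": "The daf-2(e1370) allele and e1370 again."}, ALLELE_LIKE) collects the e1370 mention once for paper p1. A real pipeline would need much more careful patterns and entity lists.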
CeNGEN data
Eduardo: I got the CeNGEN 2020 data over Christmas and have repackaged the full release data (all 100k cells with annotations, but prior to the SoupX processing) in an h5ad file, which I make available here: https://wormcells.com/
- This is the repo that makes that website: https://github.com/Munfred/wormcells-data
- I figured out a way to spin up the interface I made for doing differential expression through Google Colab, so now people can do it with any h5ad file they want. As an example, I wrote a notebook that runs it with the 100k cells from CeNGEN: https://colab.research.google.com/github/Munfred/scdefg/blob/main/scdefg.ipynb
- Since they are thinking about simple things that can be offered in WormBase, I will also briefly talk about this dashboard that I made for a UCLA group that wanted to look at nuclei data. It uses the tissue enrichment analysis code for the bottom 3 plots.