WormBase-Caltech Weekly Calls January 2021

January 28, 2021

String Matching Pipelines

  • Old pipelines on textpresso-dev are not compatible with the new TPC system
  • New TPC API does not support string matching
  • New Python library (wbtools) - used by variation pipeline - supports batch processing of WB literature and regex matching
  • Email extraction
    • No longer needed for concise description community curation tracker
    • Juancarlos, Valerio, and Chris will meet to establish a new, streamlined email address extraction pipeline
  • Old AFP Display CGI (http://tazendra.caltech.edu/~postgres/cgi-bin/author_fp_display.cgi)
    • Still uses old Textpresso-Dev and is probably no longer needed; Karen can check whether anything there is worth keeping (nothing critical)
  • Valerio will determine priorities (e.g. antibody stuff first), and send issues to curators as needed
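As a rough illustration of the kind of regex matching discussed above for email extraction, here is a minimal sketch using only Python's standard `re` module. It does not use or reproduce the wbtools API; the pattern, the function name `extract_emails`, and the sample addresses are assumptions made for illustration, and the pattern is deliberately simple (it will miss some valid addresses).

```python
import re

# Simplified email pattern (an assumption for illustration, not a full
# RFC 5322 matcher): local part, "@", then at least two dot-separated labels.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return unique addresses found in text, lowercased, in order of first appearance."""
    seen = set()
    addresses = []
    for match in EMAIL_RE.finditer(text):
        address = match.group(0).lower()
        if address not in seen:
            seen.add(address)
            addresses.append(address)
    return addresses


if __name__ == "__main__":
    # Hypothetical snippet of paper text.
    sample = "Correspondence: Jane.Doe@example.edu; see also bob@lab.example.org."
    print(extract_emails(sample))
```

Lowercasing before deduplication is a design choice here so that the same address written with different capitalization is reported once.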

Tracking interlibrary loan requests

  • Raymond: Do we have a common place to track interlibrary loan requests? Could be useful for Alliance/WB

CeNGEN data

Eduardo: I got the CeNGEN 2020 data over Christmas and repackaged the full release data (all 100k cells with annotations, but prior to the SoupX processing) into an h5ad file, which is available at https://wormcells.com/

  • This is the repo that makes that website: https://github.com/Munfred/wormcells-data
  • I figured out a way to spin up the interface I made for doing differential expression through Google Colab, so now people can use it with any h5ad file they want. As an example, I wrote a notebook that runs it with the 100k cells from CeNGEN: https://colab.research.google.com/github/Munfred/scdefg/blob/main/scdefg.ipynb
  • Since they are thinking about simple things that can be offered in WormBase, I will also briefly talk about a dashboard that I made for a UCLA group that wanted to look at nuclei data. It uses the tissue enrichment analysis code for the bottom three plots.

Tracking corresponding authors for papers at Alliance

  • Corresponding authors are not tracked in ACEDB, because authors are stored as plain-text names, not IDs
  • Maybe Cecilia could link a WBPerson as corresponding author for a paper during curation?

January 21, 2021

Neural Network (NN) Paper Classification Results

  • Linking to Paper Display tool (as opposed to Paper Editor) from Michael's webpage for NN results (Michael will make change)
  • NN results will be incorporated into the Curation Status Form
  • For AFP and VFP, there is now a table with mixed SVM and NN results ("blackbox" results); for a given paper, if NN results exist, they take priority over any SVM results
  • Decision: we will omit blackbox results (at least for now) from curation status form (just add the new NN results separately)
  • We have stopped running SVM on new papers
  • Interactions SVM has performed better than new NN results; would be worth attempting a retraining
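The priority rule described above for the mixed table (NN results override SVM results for the same paper) can be sketched as a simple dictionary merge. This is only an illustration under stated assumptions: the function name, the paper IDs, and the result values are hypothetical, and the actual tables may be structured differently.

```python
def merge_classifier_results(svm_results, nn_results):
    """Combine per-paper results; an NN result replaces any SVM result for the same paper.

    Both arguments map a paper ID to a classifier result. The returned dict
    maps each paper ID to a (source, result) tuple.
    """
    merged = {paper_id: ("SVM", result) for paper_id, result in svm_results.items()}
    for paper_id, result in nn_results.items():
        # NN takes priority, so overwrite whatever SVM entry was there.
        merged[paper_id] = ("NN", result)
    return merged


if __name__ == "__main__":
    # Hypothetical paper IDs and result labels.
    svm = {"WBPaper00000001": "high", "WBPaper00000002": "low"}
    nn = {"WBPaper00000002": "medium", "WBPaper00000003": "high"}
    for paper_id, (source, result) in sorted(merge_classifier_results(svm, nn).items()):
        print(paper_id, source, result)
```

Keeping the source label alongside each result makes it possible to show NN and SVM results separately later, which matches the decision to list the new NN results on their own.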

Community Phenotype Curation

  • On hold for a few months to commit time to updating the phenotype annotation model to accommodate, e.g. double mutant phenotypes, multiple RNAi targets (intended or otherwise), mutant transgene products causing phenotypes, expressed human genes causing phenotypes, etc.
  • Changes made for WB phenotypes may carry over to Alliance phenotype work
  • Paper out now on undergrad community phenotype curation project with Lina Dahlberg; we may get more requests for trying this with other undergrad classes

AFP Anatomy Function flagging

  • Sometimes it is difficult to assess whether an author flag is correct (flags can often be wrong or absent)
  • What about giving authors/users feedback on their flagging results?
  • Would be good to provide the content from the paper where these data are said to exist (automatically from a Textpresso pipeline or manually from author-identified data)
  • We want to be careful about how we provide feedback; we should be proactive to make improvements/modifications on our end and bring those back to users for feedback to us

January 14, 2021

PubMed LinkOut to WormBase Paper Pages (Kimberly)

  • Other databases link out from PubMed to their respective paper pages
  • For example, https://pubmed.ncbi.nlm.nih.gov/20864032/ links out to GO and MGI paper pages
  • Would like to set this up for WormBase and ultimately for the Alliance, but this will require some developer help
  • Work on this next month (after AFP and GO grant submissions)?

Update cycle for HGNC data in the OA (Ranjana)

  • Juancarlos had these questions for us:

There's a script here that repopulates the postgres obo_*_hgnc tables
based off of Chris and Wen's data
/home/postgres/work/pgpopulation/obo_oa_ontologies/populate_obo_hgnc.pl 

It's not on a cronjob, because I think the files are not updated that
often. Do we want to run this every night, or run it manually when
the files get re-generated? Or run every night, check whether the
files' timestamps have changed, and then repopulate postgres?

Minutes

PubMed LinkOut to WormBase Paper Pages

Update cycle for HGNC data in the OA

  • We will update when Alliance updates the data
  • Juancarlos will set it to check the timestamps and if they change will do an update for the OAs
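The timestamp check agreed on above could look roughly like the following Python sketch. It is not the existing Perl script: the function name `update_if_changed`, the JSON state file, and the `repopulate` callback are assumptions made for illustration. The idea is to record each source file's modification time after a successful update and rerun the update only when a recorded time no longer matches.

```python
import json
import os


def update_if_changed(paths, state_file, repopulate):
    """Call repopulate() only if any file's modification time differs from the last run.

    paths: source files to watch (hypothetical here).
    state_file: JSON file storing the modification times seen at the last update.
    repopulate: callable that performs the actual update.
    Returns True if an update was run, False otherwise.
    """
    try:
        with open(state_file) as handle:
            previous = json.load(handle)
    except FileNotFoundError:
        previous = {}  # First run: nothing recorded yet, so an update is triggered.

    current = {path: os.path.getmtime(path) for path in paths}
    if current == previous:
        return False

    repopulate()
    # Record the new timestamps only after the update has succeeded.
    with open(state_file, "w") as handle:
        json.dump(current, handle)
    return True
```

A nightly cron job could call this with a callback that runs the repopulation script. Saving the timestamps only after `repopulate()` returns means a failed update is retried on the next run.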

CeNGEN

  • Wen, Daniela, and Raymond will look at the datasets to work out how to incorporate. Start simple.
  • We will make links to pages on their site.