Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
 
(78 intermediate revisions by 6 users not shown)
Line 22: Line 22:
  
 
[[WormBase-Caltech_Weekly_Calls_2020|2020 Meetings]]
 
 
  
 
= 2021 Meetings =
 
Line 30: Line 29:
 
[[WormBase-Caltech_Weekly_Calls_February_2021|February]]
 
  
+ [[WormBase-Caltech_Weekly_Calls_March_2021|March]]
+ [[WormBase-Caltech_Weekly_Calls_April_2021|April]]

== March 4, 2021 ==
 
 
=== Webinar Monday ===
 
* Juancarlos can send a reminder email; he just needs the text for the body of the email message
 
* Can we send the webinar reminder to WB staff? Cannot use staff@wormbase.org to register
 
* Raymond will try to remember to forward to staff
 
 
 
=== Neural Network (NN) evaluation ===
 
* Kimberly (and other curators) looking through NN results
 
* Had previously prevented low-scoring SVM(?) results from being sent to authors

* We need an agreed-upon protocol for evaluation

* To avoid bias, we should randomly sample papers, but a random sample will mostly contain negatives. How do we represent high-, medium-, and low-scoring papers in the set without making curators review far more than 100 papers?

* Michael will come up with a protocol and send around a list of papers for each data type (in a Google Doc)
 
* Perhaps evaluation should happen on non-curated, newer papers
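
One way to balance random sampling against the high/medium/low coverage question above is a stratified sample: bin the papers by score, then draw the same small number at random from each bin. The sketch below is only an illustration. The cutoffs (0.33 and 0.66), the assumption that scores lie between 0 and 1, the function names, and the paper IDs are all hypothetical and are not part of any agreed protocol.

```python
import random
from collections import defaultdict


def score_bin(score, low=0.33, high=0.66):
    """Assign a score to a 'low', 'medium', or 'high' bin (cutoffs are assumptions)."""
    if score < low:
        return "low"
    if score < high:
        return "medium"
    return "high"


def stratified_sample(scores, per_bin, seed=0):
    """Randomly pick up to `per_bin` papers from each score bin.

    `scores` maps a paper ID to its classifier score.
    """
    rng = random.Random(seed)  # fixed seed so the same sample can be regenerated
    bins = defaultdict(list)
    for paper_id, score in sorted(scores.items()):
        bins[score_bin(score)].append(paper_id)
    return {
        name: sorted(rng.sample(papers, min(per_bin, len(papers))))
        for name, papers in bins.items()
    }


# Made-up scores for 100 hypothetical papers
example_scores = {f"paper{i:03d}": i / 100 for i in range(100)}
for name, papers in stratified_sample(example_scores, per_bin=10).items():
    print(name, len(papers), papers[:3])
```

With 10 papers per bin, curators would review 30 papers per data type while still seeing every score range.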
 
 
 
 
 
== March 11, 2021 ==
 
 
 
=== Spreadsheet for Data not in ACEDB ===
 
* Magdalena/Hinxton would like to know how much data at Caltech doesn't get into ACEDB
 
* Sheet here: https://docs.google.com/spreadsheets/d/1VcFykdyBcoMBvYliem8tnch5Q2VDMImL6EqlcEBCU-s/edit?usp=sharing
 
* We want to pull everything into the Alliance eventually, but maybe not everything needs to be harmonized
 
* We should evaluate existing forms and fields/tables for whether they need to be harmonized or can stay as is
 
 
 
=== CeNGEN To Dos ===
 
* Daniela, Wen, Valerio, Eduardo, Raymond met with CeNGEN
 
* Want to add JBrowse track for promoters(?)
 
* Three things:
 
** CeNGEN will provide histograms that we can put on each gene page, similar to modENCODE plots. Do they provide static images, or data that we process and display as histograms? Both are possible (Raymond)
 
** Their tool can assess "enriched" genes by cell type (enriched vs. neurons or vs. all cell types; not just housekeeping genes); these can be sent to WB -> SObA and enrichment analysis
 
** We will link to the CeNGEN homepage wherever appropriate
 
* All else will depend on Eduardo's tools for single cell expression data
 
* Raymond will be WB point person to communicate with CeNGEN
 
* Display solutions can be used for Alliance single-cell data in general
 
* May want to consider future use cases, e.g. mutant-vs-WT expression, chemical/drug-induced expression, etc.
 
 
 
=== Chen building access ===
 
* You'll need an ID with RFID? (Yes, 20-year-old IDs don't work)
 
* help.caltech.edu -> request type Card Office
 
* It will be ready and waiting for you at the Reddoor cafe.
 
 
 
 
 
== March 18, 2021 ==
 
 
 
=== Caltech Alliance source? ===
 
* Could some data (like paper class/data) go to the Alliance directly from Caltech? Could be quicker and more efficient (and allow special characters that are lost at the ACEDB layer)
 
* Data would no longer be coming from the "Single Source of Truth" for WB data (i.e. ACEDB/Datomic)
 
* WB paper data would be ahead of the ACEDB paper data
 
* Kimberly will reach out to Magdalena et al. to propose
 
 
 
=== CITAce upload ===
 
* Upload to Hinxton on April 19 (? Friday April 16th ?)
 
* Upload to CITace for Wen on Friday before (April 9th) by end of the day
 
  
=== Alliance biological working groups priority ===
 
* Keep working on harmonization and LinkML models for data types
 
  
=== LinkML data visualization ===
* Is there a way to visualize data coming from LinkML models? Like in .ACE files?
 
* There may be some software that can render that kind of visualization, but we need to see
 
* Curators want a way to make sure the model and the data are correct before officially submitting
 
* Adam plans to demonstrate a visualization of literature data next Tuesday at the literature acquisition working group. This was in the context of having a UI for seeing what's stored in the persistent database without waiting for Elasticsearch processing to pass it to the regular UI; Literature is not yet modeled in LinkML
 
  
=== QC analysis for steps in Alliance ingest pipeline ===
* Curators can get access to the FMS to look at uploaded and processed files, but need to know how to process JSON files
* Curators can also access Neo4J, but need to know how to query it
* Would be good to have readable reports to provide numbers for overall data sets
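
As a starting point for processing JSON files and producing readable numbers, the sketch below summarizes whatever top-level keys a parsed JSON object has, and for each list of records counts how often each field appears. It makes no claim about the actual layout of FMS files: the `metaData`/`data` keys and the records in the sample are made up for illustration, and `summarize` is a hypothetical helper name.

```python
import json
from collections import Counter


def summarize(doc):
    """Return readable lines describing each top-level key of a parsed JSON object."""
    lines = []
    for key, value in doc.items():
        if isinstance(value, list):
            lines.append(f"{key}: list of {len(value)} records")
            # Count how many records carry each field name
            fields = Counter(f for rec in value if isinstance(rec, dict) for f in rec)
            for field, n in sorted(fields.items()):
                lines.append(f"  {field}: present in {n} of {len(value)} records")
        else:
            lines.append(f"{key}: {type(value).__name__}")
    return lines


# Made-up sample; a real file would be read with json.load(open(path))
sample = '{"metaData": {"release": "x"}, "data": [{"id": "A", "name": "a"}, {"id": "B"}]}'
print("\n".join(summarize(json.loads(sample))))
```

Because the function only inspects key names and value types, it can be pointed at any JSON file first to learn its shape before writing anything more specific.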
 
  
  
== March 25, 2021 ==
  
=== How to access data at the Alliance ===
* Google doc summary here: https://docs.google.com/document/d/1FvrsFHZ5ga5KzPtQCFJixSdulkOXJfAEjW-qJ4N34Gc/edit?usp=sharing
* Alliance data pipeline (simple): DQMs and Ferret pipelines --> FMS API & FMS --> Loader --> Neo4J --> Java API --> Web Interface & Download Files
* FMS Swagger UI: https://fms.alliancegenome.org/swagger-ui/index.html
* Peruse all data types: https://fms.alliancegenome.org/api/datatype/all
* Neo4J web browser:  
** Stage: http://stage.alliancegenome.org:7474/browser/
** Production: http://www.alliancegenome.org:7474/browser/
* Alliance (Java) API Swagger UI: https://www.alliancegenome.org/api/swagger-ui/
* AGR Schemas repo: https://github.com/alliance-genome/agr_schemas
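
The data type URL listed above can also be read from a script instead of a browser. The sketch below uses only the Python standard library and assumes nothing about the response beyond it being JSON, so it reports the shape of the payload rather than specific fields. The helper names (`endpoint`, `fetch_json`, `preview`) are hypothetical, and the network call is left commented out.

```python
import json
import urllib.request

FMS_API = "https://fms.alliancegenome.org/api"  # base of the data type URL listed above


def endpoint(*parts, base=FMS_API):
    """Join path segments onto the API base URL."""
    return "/".join([base.rstrip("/"), *(p.strip("/") for p in parts)])


def fetch_json(url, opener=urllib.request.urlopen, timeout=30):
    """GET a URL and decode the response body as JSON."""
    with opener(url, timeout=timeout) as response:
        return json.load(response)


def preview(payload, limit=5):
    """Describe a decoded payload without assuming its schema."""
    if isinstance(payload, list):
        shown = min(limit, len(payload))
        return f"list of {len(payload)} items; first {shown}: {payload[:limit]}"
    if isinstance(payload, dict):
        return f"object with keys: {sorted(payload)}"
    return repr(payload)


# Example (needs network access):
# print(preview(fetch_json(endpoint("datatype", "all"))))
```

The Swagger UI pages listed above document which other paths and parameters exist; none are assumed here.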

Latest revision as of 18:31, 13 May 2021

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings

2020 Meetings

2021 Meetings

January

February

March

April


May 13, 2021

Textpresso supplement

  • Due Monday
  • Michael working with Paul S

AWS credits

  • Michael and Valerio were awarded AWS credits, more than they can use
  • Maybe they can be repurposed
  • Valerio will play around with AWS to determine the best/cheapest configuration before migrating to the Alliance

Automated gene descriptions

  • Will the Alliance ever accommodate non-elegans worm species? Can we port over the computed/derived descriptions for non-elegans species to the Alliance?
  • Maybe have clade-specific descriptions based on the popular model (worms based on C. elegans); may be provided in MOD portal page(s)
  • May be the focus of an Alliance supplement
  • We want a flexible pipeline that can be configured depending on availability of data (e.g. protein domains)

IWM 2021 WB Workshop

  • Scheduled for June 22, 2021
  • Session begins at 8:30am Pacific / 11:30am Eastern / 4:30pm UK
  • Workshop runs for 90 minutes: four 15-minute talks followed by a 30-minute Q&A session
  • Here is the submitted workshop schedule:
11:30 am (EDT) Magdalena Zarowiecki, EMBL-EBI, A whistle-stop tour of all the types of data you can find in WormBase
11:45 am (EDT) Chris Grove, California Institute of Technology, Researching transcriptional regulation using WormBase transcription factors, TF binding sites and the modENCODE data
12:00 pm (EDT) Ranjana Kishore, California Institute of Technology, Comparative genomics and disease research using Alliance of Genome Resources
12:15 pm (EDT) Daniela Raciti, California Institute of Technology, How can you contribute? Community curation and tools, and the author-first-pass (AFP) pipeline
12:30 pm (EDT) Chris Grove, California Institute of Technology, Open Discussion / Q & A
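
The start times and the 90-minute total above can be cross-checked with a few lines of Python. This is a minimal sketch that hard-codes the summer-time UTC offsets in effect on June 22, 2021 (PDT is UTC-7, EDT is UTC-4, BST is UTC+1) rather than looking up a time zone database; the `slots` helper is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

# Fixed summer-time offsets in effect on June 22, 2021
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")
BST = timezone(timedelta(hours=1), "BST")

start = datetime(2021, 6, 22, 8, 30, tzinfo=PDT)


def slots(start, talks=4, talk_minutes=15, qa_minutes=30):
    """Return (begin, end) pairs for each talk plus the closing Q&A."""
    lengths = [talk_minutes] * talks + [qa_minutes]
    out, t = [], start
    for minutes in lengths:
        end = t + timedelta(minutes=minutes)
        out.append((t, end))
        t = end
    return out


schedule = slots(start)
for begin, end in schedule:
    print(
        begin.astimezone(EDT).strftime("%I:%M %p %Z"),
        "/",
        begin.astimezone(BST).strftime("%H:%M %Z"),
    )
print("total minutes:", (schedule[-1][1] - start) // timedelta(minutes=1))
```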