Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
(33 intermediate revisions by 5 users not shown)
Line 33: Line 33:
 
[[WormBase-Caltech_Weekly_Calls_April_2021|April]]
 
[[WormBase-Caltech_Weekly_Calls_April_2021|April]]
  
== May 27, 2021 ==
+
[[WormBase-Caltech_Weekly_Calls_May_2021|May]]
  
=== Single cell ===
+
[[WormBase-Caltech_Weekly_Calls_June_2021|June]]
* Eduardo and Valerio will give an update on single cell analysis and visualization tools
 
  
  
== May 13, 2021 ==
+
== July 1, 2021 ==
  
=== Textpresso supplement ===
+
=== Importing genes for tm alleles from GeneACE ===
* Due Monday
+
* https://github.com/WormBase/website/issues/8262
* Michael working with Paul S
+
* Nightly dump currently excludes tm allele genes
 +
* Most tm (Mitani) alleles are not being manually connected to specific genes in GeneACE
 +
* Should pull the data from WS release (into Postgres) after the build has mapped the alleles to genes
 +
* ~100,000 alleles in Postgres; ~70,000 don't have a gene connection
 +
* Would Hinxton be willing to take WS-mapped allele-to-gene associations and populate GeneACE with associations not already in there?
  
=== AWS credits ===
+
=== Citace upload ===
* Michael and Valerio were awarded AWS credits, more than they can use
+
* Curators upload files to Spica for citace upload on Tuesday (July 6)
* Maybe they can be repurposed
 
* Valerio will play around with AWS to determine the best/cheapest configuration before migrating to the Alliance
 
  
=== Automated gene descriptions ===
+
=== Chen B1 kitchen Usage Considerations ===
* Will the Alliance ever accommodate non-elegans worm species? Can we port over the computed/derived descriptions for non-elegans species to the Alliance?
+
* Clean up after oneself.
* Maybe have clade-specific descriptions based on the popular model (worms based on C. elegans); may be provided in MOD portal page(s)
+
* Mark food storage with name and date.
* May be the focus of an Alliance supplement
+
* Mark storage drawers
* We want a flexible pipeline that can be configured depending on availability of data (e.g. protein domains)
+
* Consumables
  
=== IWM 2021 WB Workshop ===
 
* Scheduled for June 22, 2021
 
* Session begins at 8:30am Pacific / 11:30am Eastern / 4:30pm UK
 
* Workshop runs for 90 minutes: four 15-minute talks followed by a 30-minute Q&A session
 
* Here is the submitted workshop schedule:
 
11:30 am (EDT) Magdalena Zarowiecki, EMBL-EBI, A whistle-stop tour of all the types of data you can find in WormBase
 
11:45 am (EDT) Chris Grove, California Institute of Technology, Researching transcriptional regulation using WormBase transcription factors, TF binding sites and the modENCODE data
 
12:00 pm (EDT) Ranjana Kishore, California Institute of Technology, Comparative genomics and disease research using Alliance of Genome Resources
 
12:15 pm (EDT) Daniela Raciti, California Institute of Technology, How can you contribute? Community curation and tools, and the author-first-pass (AFP) pipeline
 
12:30 pm (EDT) Chris Grove, California Institute of Technology, Open Discussion / Q & A
 
  
 +
== July 8, 2021 ==
  
== May 20, 2021 ==
+
=== Alliance work ===
 +
* Orange team presenting initial plans at Alliance PI meeting tomorrow
 +
* What working groups are still meeting? What are their responsibilities?
 +
** Expression
 +
** Variants
 +
** Disease & Phenotype
 +
** Technical working groups
 +
*** Technical call
 +
*** Data quartermasters
 +
*** DevOps
 +
* Expression working group working on LinkML model with Gil (FB)
 +
** Includes work on antibody class, image class, movie class
 +
** Are species-specific anatomy ontologies being utilized for expression annotations or still just Uberon?
 +
* Creating a curation interface/tool:
 +
** Will require loading auxiliary data types in addition to primary data types, so that they are available to connect annotations to (e.g. if we are focused on disease annotation curation, we will need to load genes, alleles, strains, etc. in addition to the disease annotations themselves)
 +
** One requirement already expressed by curators is the need to generate new entities, like alleles, and have them quickly (immediately?) available for use in curation
 +
** Maybe this could be handled by an Alliance central name server that mints new IDs (Alliance IDs and maybe also MOD IDs?) for the objects to make them available (also with a mechanism for these new objects/IDs to make their way back to the MODs as well)
 +
** Micropublication curation forms have tackled a lot of issues of collecting lexica and entity names and IDs; should consider work already done with Micropubs
  
=== Ontology updates in OA ===
+
== July 15, 2021 ==
* ODK pipeline is not updating the "date" line in OBO artifacts and thus the ontology is not updating
 
* Need to remove the "date" from the anatomy ontology OBO file header and let the OA script use the "data-version" line instead
 
* Life stage still has "date"; Chris will investigate if there are update issues for life stage
 
* Juancarlos has updated the GO URIs from Kimberly's suggestions; use monthly release URI or ~daily snapshot URIs?
 
** Should probably go with daily snapshots for curation purposes (frequent updating for new/deprecated terms)
 
  
=== OA updates for UTF-8 ===
+
=== hlh-34 expression ===
* Daniela and Juancarlos have updated OA dumpers (antibody, expression, picture, movie) accordingly
+
* Rebecca McWhirter (Miller lab) contacted WB saying that the annotations to AVJ for hlh-34 are incorrect. 4 evidence entries list AVJ; the authors of the first paper that describes hlh-34 expression (Cunningham et al., 2012: http://dx.doi.org/10.1016/j.cmet.2012.05.014) had to pick one neuron per reviewer's request. The neuron should instead be AVH.
* Others need to be updated:
+
* Oliver is putting together a micropub to clarify the issue
** Chris will help with Interaction OA, GeneReg OA, Phenotype OA, RNAi OA
+
* How to deal with existing annotations? Add a comment in the remarks that points to the micropub? Remove AVJ from the anatomy association list?
** Ranjana will help with Disease, Genotype and Concise OAs
+
* Should we also add public comments to the relevant papers?
** Karen will help with Construct, Transgene, Topic, Molecule OAs
 
* Anatomy function form (Valerio wrote) writes to Postgres; need some updates
 
  
=== Student Internship ===
+
== July 22, 2021 ==
* Student will work with Raymond, Kimberly to do anatomy function & Noctua/GO-CAM curation on dauer development
 
* To do anatomy function curation in Noctua, requirements will need to be written out and addressed
 
* Can the anatomy function form write directly into Noctua (eventually)?
 
  
=== AI-readiness supplement ===
+
=== Copying data from textpresso-dev to tazendra ===
* Valerio looking into AI-ready format input
+
* Michael has been asking curators to retrieve any data they might still want on textpresso-dev before the machine dies
* Looking into applying to neural circuits
+
* Can we copy files to tazendra?
 
+
* If so, do we need to have a more general approach/strategy other than creating new folders in the individual curator directories?
 
+
* Are there any size considerations for what we copy over? There is 1.1T free in /home2, which is not backed up
 
 
== May 27, 2021 ==
 
 
 
=== CeNGEN data ===
 
* CeNGEN project wants to generate expression displays like our FPKM plots (made by Gary Williams)
 
* Can we generate plots like these with this new data?
 
* Maybe we can just give the data directly to the web team for display, rather than load all the data into Citace/ACEDB (introducing heavy load to system) and have it pass through the whole build pipeline
 
* Does the CeNGEN data get preferential treatment and display, like 'Featured Data' (at least for the near term)? Or does it get integrated with all other expression data equally? Should discuss with Todd/Webteam
 
* Raymond et al. will communicate with Todd and web team about how to handle the data (will also start a ticket)
 
 
 
=== Help desk AFP request ===
 
* Kimberly will send out the AFP form for the paper when
 
** https://github.com/WormBase/website/issues/8243
 
 
 
=== Single-cell data tools ===
 
* Interactive differential expression, gene abundance histograms, heatmaps & dot plots, swarm plots
 
* Split into two code bases: scdefg and wormcells-viz
 
* wormcells-viz: http://cervino.caltech.edu:3000/
 
* Eduardo has wrangled the data into a standard format (.h5ad) (only 10x data sets)
 
* Interactive differential expression demo: improvement
 
** Now users can choose which aspects to stratify the data by
 

Revision as of 18:00, 22 July 2021

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings

2020 Meetings

2021 Meetings

January

February

March

April

May

June


July 1, 2021

Importing genes for tm alleles from GeneACE

  • https://github.com/WormBase/website/issues/8262
  • Nightly dump currently excludes tm allele genes
  • Most tm (Mitani) alleles are not being manually connected to specific genes in GeneACE
  • Should pull the data from WS release (into Postgres) after the build has mapped the alleles to genes
  • ~100,000 alleles in Postgres; ~70,000 don't have a gene connection
  • Would Hinxton be willing to take WS-mapped allele-to-gene associations and populate GeneACE with associations not already in there?
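
A minimal sketch of the gap-filling step discussed above, assuming the Postgres allele-to-gene connections and the WS-mapped associations have each already been loaded into plain dictionaries. The function name, dictionary layout, and the example allele and gene names are hypothetical placeholders, not actual WormBase schema or identifiers.

```python
def missing_gene_associations(curated, ws_mapped):
    """Return WS-mapped allele-to-gene associations that are not yet curated.

    curated:   dict of allele name -> set of gene IDs already connected
               (hypothetical stand-in for what is in Postgres/GeneACE)
    ws_mapped: dict of allele name -> set of gene IDs mapped by the WS build
               (hypothetical stand-in for data pulled from a WS release)
    """
    additions = {}
    for allele, ws_genes in ws_mapped.items():
        known = curated.get(allele, set())
        new_genes = set(ws_genes) - set(known)
        if new_genes:
            additions[allele] = new_genes
    return additions


# Placeholder data: one tm allele with no gene connection, one allele already connected.
curated = {"tm_example_1": set(), "allele_example_2": {"GENE_B"}}
ws_mapped = {"tm_example_1": {"GENE_A"}, "allele_example_2": {"GENE_B"}}

print(missing_gene_associations(curated, ws_mapped))
```

Only associations absent from the curated set are returned, so existing manual connections are never overwritten.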

Citace upload

  • Curators upload files to Spica for citace upload on Tuesday (July 6)

Chen B1 kitchen Usage Considerations

  • Clean up after oneself.
  • Mark food storage with name and date.
  • Mark storage drawers
  • Consumables


July 8, 2021

Alliance work

  • Orange team presenting initial plans at Alliance PI meeting tomorrow
  • What working groups are still meeting? What are their responsibilities?
    • Expression
    • Variants
    • Disease & Phenotype
    • Technical working groups
      • Technical call
      • Data quartermasters
      • DevOps
  • Expression working group working on LinkML model with Gil (FB)
    • Includes work on antibody class, image class, movie class
    • Are species-specific anatomy ontologies being utilized for expression annotations or still just Uberon?
  • Creating a curation interface/tool:
    • Will require loading auxiliary data types in addition to primary data types, so that they are available to connect annotations to (e.g. if we are focused on disease annotation curation, we will need to load genes, alleles, strains, etc. in addition to the disease annotations themselves)
    • One requirement already expressed by curators is the need to generate new entities, like alleles, and have them quickly (immediately?) available for use in curation
    • Maybe this could be handled by an Alliance central name server that mints new IDs (Alliance IDs and maybe also MOD IDs?) for the objects to make them available (also with a mechanism for these new objects/IDs to make their way back to the MODs as well)
    • Micropublication curation forms have tackled a lot of issues of collecting lexica and entity names and IDs; should consider work already done with Micropubs

July 15, 2021

hlh-34 expression

  • Rebecca McWhirter (Miller lab) contacted WB saying that the annotations to AVJ for hlh-34 are incorrect. 4 evidence entries list AVJ; the authors of the first paper that describes hlh-34 expression (Cunningham et al., 2012: http://dx.doi.org/10.1016/j.cmet.2012.05.014) had to pick one neuron per reviewer's request. The neuron should instead be AVH.
  • Oliver is putting together a micropub to clarify the issue
  • How to deal with existing annotations? Add a comment in the remarks that points to the micropub? Remove AVJ from the anatomy association list?
  • Should we also add public comments to the relevant papers?

July 22, 2021

Copying data from textpresso-dev to tazendra

  • Michael has been asking curators to retrieve any data they might still want on textpresso-dev before the machine dies
  • Can we copy files to tazendra?
  • If so, do we need to have a more general approach/strategy other than creating new folders in the individual curator directories?
  • Are there any size considerations for what we copy over? There is 1.1T free in /home2, which is not backed up
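
One way to approach the size question is to measure a directory before copying it and compare that with the free space on the destination. The sketch below assumes it is run on a machine that can see both the source directory and the destination mount; the function names and the 10% headroom default are illustrative assumptions, not an agreed policy.

```python
import os
import shutil


def directory_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def fits_with_headroom(source_dir, dest_dir, headroom_fraction=0.1):
    """Return True if source_dir fits on dest_dir's filesystem,
    leaving headroom_fraction of the currently free space unused."""
    free = shutil.disk_usage(dest_dir).free
    needed = directory_size_bytes(source_dir)
    return needed <= free * (1 - headroom_fraction)
```

For example, `fits_with_headroom("/path/to/curator_data", "/home2")` would report whether a given directory fits; the source path here is a placeholder.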