Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
 
Line 17: Line 17:
 
[[WormBase-Caltech_Weekly_Calls_2017|2017 Meetings]]
 
[[WormBase-Caltech_Weekly_Calls_2017|2017 Meetings]]
  
 +
[[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]]
  
= 2018 Meetings =
+
[[WormBase-Caltech_Weekly_Calls_2019|2019 Meetings]]
  
== January 4, 2018 ==
 
  
=== WS264 Upload ===
+
GoToMeeting link: https://www.gotomeet.me/wormbase1
* Citace upload to Wen, Tuesday January 16th, by 10am PST
 
* Upload to Hinxton on Jan 19th
 
  
=== Strain data import to AGR for disease ===
+
= 2020 Meetings =
* Will begin to consider pulling in strains into AGR
 
* Will need to think about how genotypes are built and stored at other MODs
 
* We should encourage authors to include strain IDs
 
* Diseases are annotated to genes, alleles, and strains within WB
 
  
=== Curating phenotypes and diseases to strains or genotypes ===
+
[[WormBase-Caltech_Weekly_Calls_January_2020|January]]
* Should we generate a ?Genotype class to capture genotypes without a known strain name? Or to capture relevant/relative genotypes thought to be responsible for a phenotype or disease?
 
* We could create un-named strain objects, that use a new unique identifier as a primary identifier and represent the entire genotype of a strain used
 
** Introduction of a new ?Strain class attribute of a unique serial identifier (like WBStrain00001) would be very costly to implement; would need to consider how crucial this is before implementing
 
** We can, instead, use new strain (public) names like "WBPaper00012345_Strain1", etc. instead of creating new unique ID attribute for un-named strains
 
* When curating phenotypes to strains, we will want to specify what is the relevant/relative genotype that is causative/correlated with the disease or phenotype observation
 
** Would be best if the specification of the relevant genotype used controlled vocabularies (when possible) and free text (when needed); would need to work out the logistics/mechanics of such curation
 
** Transgene-phenotype curation currently specifies causative gene, but would be more complicated for strains
 
* Alternatively, we could create the ?Genotype class to represent the abstract "relative"/"relevant" genotype thought to be responsible for the phenotype or disease, and annotate directly to that ?Genotype object
 
* ?Strain approach:
 
** Use strain if named (but important to know if the control strain is not simply N2)
 
*** If control strain is simply N2, causative genotype (and respective components) can be inferred from strain genotype
 
*** If control strain is not N2, causative genotype and components would need to be specified at the moment of phenotype/disease curation (by mechanism to be worked out)
 
** If no strain name provided, create "un-named" strain that contains the entire genotype provided by authors
 
*** Control strain issues above would still need to be addressed
 
* ?Genotype approach:
 
** ?Genotype class could represent individual instances of relevant/relative genotypes that are suggested to be causative for a disease or phenotype
 
** ?Genotype objects would be created with formal construction, with DB associations to each component object (e.g. alleles, transgenes, etc.) as well as free text descriptions (for components with no corresponding DB object)
 
** Such ?Genotype objects could be used repeatedly throughout a paper when applicable, but would likely not be used in any other papers (we would likely accumulate redundant objects in the DB)
 
* We may want to consider strains with same public name that have diverged
 
** Apply new strain names with prefixes/suffixes? Create new strain objects? Keep original?
 
* Need to determine how each AGR member DB curates phenotypes or diseases to genotypes: is each "genotype" a relative or absolute genotype?
 
  
 +
[[WormBase-Caltech_Weekly_Calls_February_2020|February]]
  
== January 11, 2018 ==
 
  
=== IWM swag ===
+
== March 5, 2020 ==
* Eppendorf tube openers with WormBase logo?
 
  
=== Update on AFP Form ===
+
=== WormBase curation ===
*[https://docs.google.com/presentation/d/1anFOFRK9Ida1UEvrXWf2OBAJ9F4xyU4lWJGy-qr6wRU/edit#slide=id.p3 In-progress mock-up of new form]
+
* Tim Schedl: WB needs to do more curation for phenotype and variation
*[https://docs.google.com/spreadsheets/d/1sS_uAjBJ2r5H90Lam62Ai0HunjwvfjnklkFNrDoNXeU/edit#gid=1929595460 Data type spreadsheet]
+
* Doing community curation, but not fast enough
*[http://wiki.wormbase.org/index.php/First-pass_flagging_pipelines#Author_first-pass_form_revisions Curation forms info]
+
* Would like to work on more automation; e.g. having Textpresso return all sentences mentioning an allele in papers that are predicted to have phenotype data
  
*Idea is to move from author flagging to author validation of text mining and data submission wherever possible
+
=== Alaska ===
*Goal is to flag all data types in a paper and either curate at WB or share with a group that does curate that data
+
* Joseph may be able to meet with Raymond
*SVM flags and author flags can/will be used as filters in TPC
+
* How can the tool be maintained? If it can't be, it needs to be pulled
*Provide examples of what we want for each type of data to help avoid confusion
 
* Recognize entities automatically and show list to author
 
** Species, strains, genes, alleles, transgenes, etc.
 
** Ask to verify or add unrecognized
 
** Could show known/existing objects with checkboxes
 
** Possibly include unrecognized pattern matching objects? Ask author to verify if these are real?
 
** For strains:
 
*** Show recorded genotype for verification; maybe ask to update/modify if needed?
 
** For transgenes:
 
*** When author submits new transgene, send them to a transgene form, or send them an email asking for details?
 
*** Form could be for both strain and transgene
 
* Mapping data: still ask for? Maybe for balancers, but no one is reporting that. Could still ask if there's interest
 
* Maybe provide option for author to save their progress and return to the form later
 
* Phenotypes
 
** Ask for allele, RNAi and overexpression phenotypes with links to Phenotype form
 
** Also ask for drug/chemical and environmental perturbations (call treatment?); store as free text for now, accommodate with new data model when available
 
* Gene site- and time-of-action, mosaic
 
** Appears to be confusion from authors about mosaics. Should we keep this?
 
** Will keep gene site-of-action and time-of-action; leave unchecked (no SVM, yet) but allow users to indicate
 
* Cell and anatomy data
 
** Cell function ("Cell ablation (laser/genetic) data, optogenetics")
 
** Ultrastructural analysis
 
* Interaction data
 
** Genetic interactions
 
** Physical interactions
 
** Functional complementation
 
* Comparative genomics
 
* Gene expression & regulation
 
  
 +
=== Single cell expression tool ===
 +
* Tool is here: http://de.wormcells.com/
 +
* What do we need to consider for pulling into WormBase?
 +
* Where does the data sit and how does the tool access it? Can we integrate the data more into WormBase (e.g. local curation databases)?
 +
** Would want it to be in sync with WormBase releases
 +
** May not need to reprocess with every release with current (small-ish) data set
 +
* Fully integrated tool versus an embedded tool?
  
 +
=== Name service updates ===
 +
* For now we will stick to the CGIs that Juancarlos has built for variations and strains
 +
* Juancarlos can discuss with Matt R. and Sibyl the possibility of integrating API requests to the Name service into a CGI form for the OA
  
== January 18, 2018 ==
 
  
=== WormBase Tutorials ===
+
== March 12, 2020 ==
* May be good to get (possibly anonymous) written questions or suggestions after presenting
 
* Wen will have Skype call with Yishi Jin
 
* Micropublications
 
** How do we peer-review a single experiment? No supporting information to corroborate a larger story
 
** Is the greater benefit of peer review that the whole story is assessed by reviewers?
 
** Do MPs help or hurt reproducibility?
 
** Larger papers may have lots of poor experiments that don't get much attention but still pass peer review
 
** Dedicated peer review on single experiment may be more rigorous
 
** What are the criteria/minimal requirements to micropublish?
 
* Concise descriptions
 
** SimpleMine has multiple descriptions output; people asked about the different types
 
** Yishi Jin suggested that we remind users to update manually written descriptions
 
** Showing last-updated date is important
 
** Automated descriptions rely on primary data; will rely on forms and community submissions
 
** Microreviews? Would want to guide authors on what data we want; provide a template?
 
* Public/community education issues
 
** Users shouldn't assume that WormBase is comprehensively up to date
 
* Wen will also present at MidWest meeting (Ann Arbor, MI) in April and Boulder, Colorado in May
 
** Will assess topic interest ahead of time
 
  
=== New Cytoscape display for interactions ===
+
=== Cancelled meetings ===
* Sibyl developed a new Cytoscape display for interactions, now live with WS262 release
+
* Meetings are being cancelled, including Biocuration 2020. Ranjana will update our WormBase meeting page with a note to users (done)
* Simplified colors and subtypes
+
* We should start doing online advertisement for micropublications with webinars
* Redraw button to clean up the graph based on what you want to see
 
* Play around and let Sibyl and/or Chris know about issues
 
  
 +
=== Storing invalid/avoided WBPersons and email addresses in Postgres ===
 +
* Currently Chris has been maintaining ([https://docs.google.com/spreadsheets/d/1FHhQk_IZIBLYkOUdVf9Kfh5zx66rkHOEVFQQ6wzT2ks/edit?usp=sharing on Google sheets]) a list of papers, people, and email addresses to omit from future WormBase outreach requests
 +
* Valerio would like to add these to Postgres to keep a more central repository of them
 +
* Chris would still like to be able to readily update/edit the content of those lists
  
== January 25, 2018 ==
+
* Proposed solutions: Keep the list in a flat file and have a cron job to sync the data to postgres daily or create a simple form to create/modify entries directly in postgres
  
=== UCSF visit report ===
+
=== Updates on Alaska ===
  
Questions from the audience
+
Raymond and Eduardo met with Joseph and decided to keep the tool running as is for now, even though maintaining it can be hard.
*Is there a way in WB to pull out verified CRISPR guides?
+
The future plan is to move the platform to Google Colab to reduce maintenance work.
*Single cell RNAseq from Waterston lab in WB? Paper is in WB, person-paper connections might be on staging, Kimberly verifying this. Gary W will put data in WS264. Comments from Gary on this paper:
 
‘The main problem is that the authors can sort the cells into groups (corresponding to tissues, I think) and sometimes sort them into single cells, but it is hard to identify the tissues or cells. I think they found 29 groups, of which about 20 appear to be single cells.
 
I think they have a website where they invite other researchers to make suggestions about how to identify cells in their data.
 
Currently I regard this data set as still undergoing analysis and I'm waiting to see if they improve the deconvolution and identification of the cells.
 
The RNASeq data from this paper will be going into the current Build (WS264) but I will not be adding it to SPELL this Build because displaying it probably requires more thought.
 
I'm not sure that tools for displaying single-cell data have been developed very much yet. There is potentially a lot of information if eventually all 959 or 1031 somatic cells are displayed!
 
  
*List of fem-3 alleles excluding natural variants in WormMine, how to do that? The template should be fixed (checkbox)
+
==Nameservice discussion==
*Is the increase in published papers due to an increased number of labs? Expansion of the field? Is there a correlation between the increased number of publications and the increased number of labs?
+
*Getting a token has worked for Chris and Ranjana
*One user found it powerful to be able to use the Galaxy server to do analysis after exporting data from WormMine
+
*Karen and Daniela need to get tokens
*Can you do protein domain analysis with WormMine? Is the protein->motif precanned query the best option?
 
*Enrichment: how to see which genes are expressed in a tissue or cell -> pointed user to the ontology browser
 
  
=== Skype call with Yishi Jin and Sreekanth Chalasani ===
+
==Noctua GO-CAM updates==
Participants: Daniela, Karen, Wen, Jin, Shrek
+
*Noctua is a production tool available for annotations right now
 +
*The ongoing goal is to create as many interconnected models as possible and make them available to curators
 +
*Reactome models are in the process of being imported
 +
*Real-time validation messages, shown as curation is being done, are being worked on
 +
*Kimberly will send out the details for logging into Noctua and make sure all curators are on the login list
  
*Not all images available due to publisher agreement -> not clear to users, we should put a disclaimer somewhere
+
 
*Shrek: thinks gene expression display should improve => hard to figure out all expression patterns
+
== March 19, 2020 ==
**Shrek and Jin feel the most important info is the reagent, and that should be displayed on the gene page/expression widget
+
 
*Images are identical -> example on the eat-4 expression page. This is normal since one image depicts multiple neurons and the association Anatomy neuron applies.
+
=== Citace Upload ===
http://www.wormbase.org/species/c_elegans/gene/WBGene00001135#1860--10
+
* Tuesday, March 31st
**On the eat-4 expression page the problem is amplified as one picture shows 70+ neurons
+
 
**We should remove the image column and have links to images only on the panel above (as currently displayed)
+
=== Latest ACEDB ===
**The image column should be replaced with reagent and description, if possible. Will need to talk to Sibyl and see what is doable.
+
* Getting latest ACEDB build from staging FTP (for descriptions, etc.)
*Shrek pointed out that the eat-4 concise description is out of date, we explained that in the near future the manual descriptions will be superseded by automatic descriptions
+
 
*Neuron connectivity: missing neuron connectivity pages; there is a new reconstruction of neuroanatomy from David Hall that would be great to integrate
+
=== Storing email addresses/persons to omit from requests ===
 +
* Chris and Juancarlos will work on a form to submit email addresses and WBPerson IDs to Postgres
 +
 
 +
=== Mailing lists ===
 +
* Let Todd know if you want to keep your caltech.edu account on the mailing lists
 +
 
 +
 
 +
==March 26th==
 +
===Community curation===
 +
*Do we need to publicize any data form and urge users to contribute? The AFP and phenotype forms already send targeted e-mail
 +
* Paul S. is getting people from the lab to do phenotype curation; Chris G will run a tutorial for interested people
 +
* Blog about phenotype form to make people generally aware
 +
* Could the phenotype form be adapted for other species? Possibly but depends on:
 +
** Whether genotypes could be loaded and recognized
 +
 
 +
=== AFP for older papers ===
 +
* Discussed recently
 +
* Many (all?) old papers have already gone through the old AFP pipeline
 +
* Need to check what has already been curated
 +
 
 +
=== Omit form ===
 +
* http://tazendra.caltech.edu/~postgres/cgi-bin/omit_form.cgi
 +
* Can view, add, or edit persons, email addresses, and IP addresses to omit from email requests (and IP addresses to block)
 +
 
 +
=== Problems accessing Textpresso Central ===
 +
* Ranjana had noticed that the server was acting slow
 +
* Unknown problem; hard to pin down
 +
* Definitely a Caltech network system issue
 +
* Issue for past couple of weeks
 +
* May be able to ask Caltech IMSS for a diagnostic analysis
 +
* Bad switch? Maybe; there is an unused switch that could be used to test
 +
** Maybe should dedicate an ethernet socket to the server(s) (avoiding switch altogether)
 +
* Tazendra is good, mangolassi seems to have had some trouble recently as well
 +
* All on subnet 52; Raymond has been monitoring and seems to be working well; could suggest a local (e.g. computer-specific, switch-specific) issue
 +
 
 +
=== Automated concise pathway descriptions? ===
 +
* Could we generate pathway descriptions based on, for example, GO-CAM models?
 +
* Might be straightforward if clear guidelines can be documented (important players/genes, important/central functions/processes)
 +
 
 +
=== Genotype OA on Sandbox ===
 +
* http://mangolassi.caltech.edu/~postgres/cgi-bin/oa/ontology_annotator.cgi
 +
* Take a look and send feedback to Juancarlos and Chris
 +
 
 +
=== Name Service temp submission for strain and variation to Postgres ===
 +
* Sandbox: http://mangolassi.caltech.edu/~postgres/cgi-bin/temp_objects.cgi
 +
* Adding strains or variations will create the object in the Name Service as well as add it to Postgres
 +
* Test it and make sure it works; NOTE: Creating objects on both Mangolassi AND Tazendra will create real object entries in the Name Service; if testing on the Sandbox, make sure to tell Paul Davis and Matt Russell to remove test submissions from the Name Service

Latest revision as of 00:44, 27 March 2020

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings


GoToMeeting link: https://www.gotomeet.me/wormbase1

2020 Meetings

January

February


March 5, 2020

WormBase curation

  • Tim Schedl: WB needs to do more curation for phenotype and variation
  • Doing community curation, but not fast enough
  • Would like to work on more automation; e.g. having Textpresso return all sentences mentioning an allele in papers that are predicted to have phenotype data
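
The automation idea in the last bullet could be prototyped along these lines. This is only a minimal sketch in plain Python, not Textpresso's actual API: the function name `sentences_mentioning_alleles`, the simplified allele pattern (one to three lowercase letters followed by digits, e.g. e1370), the naive sentence splitter, and the example text are all assumptions made for illustration.

```python
import re

# Assumption: an allele name looks like 1-3 lowercase letters plus digits (e.g. e1370).
# This is a simplification and will also match unrelated tokens of the same shape.
ALLELE_RE = re.compile(r"\b[a-z]{1,3}\d+\b")

# Naive sentence splitter: break on whitespace that follows '.', '!' or '?'.
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def sentences_mentioning_alleles(text, known_alleles=None):
    """Return (sentence, [allele names]) pairs for sentences that mention an allele.

    If known_alleles is given, only names in that collection are kept, which
    filters out false positives from the simplified pattern.
    """
    results = []
    for sentence in SENTENCE_SPLIT_RE.split(text.strip()):
        found = ALLELE_RE.findall(sentence)
        if known_alleles is not None:
            found = [name for name in found if name in known_alleles]
        if found:
            results.append((sentence, sorted(set(found))))
    return results


if __name__ == "__main__":
    # Invented example text, not taken from any paper.
    example = (
        "daf-2(e1370) mutants are long-lived. "
        "Wild-type animals were used as controls. "
        "The unc-13(s69) allele causes paralysis."
    )
    for sentence, alleles in sentences_mentioning_alleles(example):
        print(alleles, "->", sentence)
```

Passing a set of known allele names as `known_alleles` would restrict matches to existing objects, which is probably closer to what a curator would want to review.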

Alaska

  • Joseph may be able to meet with Raymond
  • How can the tool be maintained? If it can't be, it needs to be pulled

Single cell expression tool

  • Tool is here: http://de.wormcells.com/
  • What do we need to consider for pulling into WormBase?
  • Where does the data sit and how does the tool access it? Can we integrate the data more into WormBase (e.g. local curation databases)?
    • Would want it to be in sync with WormBase releases
    • May not need to reprocess with every release with current (small-ish) data set
  • Fully integrated tool versus an embedded tool?

Name service updates

  • For now we will stick to the CGIs that Juancarlos has built for variations and strains
  • Juancarlos can discuss with Matt R. and Sibyl the possibility of integrating API requests to the Name service into a CGI form for the OA


March 12, 2020

Cancelled meetings

  • Meetings are being cancelled, including Biocuration 2020. Ranjana will update our WormBase meeting page with a note to users (done)
  • We should start doing online advertisement for micropublications with webinars

Storing invalid/avoided WBPersons and email addresses in Postgres

  • Currently Chris has been maintaining (on Google sheets) a list of papers, people, and email addresses to omit from future WormBase outreach requests
  • Valerio would like to add these to Postgres to keep a more central repository of them
  • Chris would still like to be able to readily update/edit the content of those lists
  • Proposed solutions: Keep the list in a flat file and have a cron job to sync the data to postgres daily or create a simple form to create/modify entries directly in postgres
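
The flat-file option above could look roughly like the following. This is a hedged sketch, not the implementation that was adopted: the table name `omit_list`, the tab-separated `kind<TAB>value` file layout, and both function names are hypothetical, and `sqlite3` stands in for Postgres so the example runs without a server (a Postgres driver would use a different placeholder style).

```python
import sqlite3


def parse_omit_lines(lines):
    """Parse 'kind<TAB>value' lines, skipping blank lines and '#' comments."""
    entries = set()
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        kind, _, value = line.partition("\t")
        if kind and value:
            entries.add((kind.strip(), value.strip()))
    return entries


def sync_omit_table(conn, entries):
    """Make the (hypothetical) omit_list table match the flat file exactly.

    Returns a tuple (rows_added, rows_removed).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS omit_list ("
        "kind TEXT NOT NULL, value TEXT NOT NULL, PRIMARY KEY (kind, value))"
    )
    existing = set(conn.execute("SELECT kind, value FROM omit_list"))
    to_add = entries - existing
    to_remove = existing - entries
    conn.executemany(
        "INSERT INTO omit_list (kind, value) VALUES (?, ?)", sorted(to_add)
    )
    conn.executemany(
        "DELETE FROM omit_list WHERE kind = ? AND value = ?", sorted(to_remove)
    )
    conn.commit()
    return len(to_add), len(to_remove)


if __name__ == "__main__":
    # Invented sample data; a daily cron job would read the real flat file instead.
    sample = ["# people and addresses to omit", "person\tWBPerson0000001",
              "email\tsomeone@example.org"]
    connection = sqlite3.connect(":memory:")
    print(sync_omit_table(connection, parse_omit_lines(sample)))
```

Because the sync treats the flat file as the single source of truth, edits stay as simple as changing a text file. The trade-off is that changes made directly in the database would be overwritten at the next run.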

Updates on Alaska

Raymond and Eduardo met with Joseph and decided to keep the tool running as is for now, even though maintaining it can be hard. The future plan is to move the platform to Google Colab to reduce maintenance work.

Nameservice discussion

  • Getting a token has worked for Chris and Ranjana
  • Karen and Daniela need to get tokens

Noctua GO-CAM updates

  • Noctua is a production tool available for annotations right now
  • The ongoing goal is to create as many interconnected models as possible and make them available to curators
  • Reactome models are in the process of being imported
  • Real-time validation messages, shown as curation is being done, are being worked on
  • Kimberly will send out the details for logging into Noctua and make sure all curators are on the login list


March 19, 2020

Citace Upload

  • Tuesday, March 31st

Latest ACEDB

  • Getting latest ACEDB build from staging FTP (for descriptions, etc.)

Storing email addresses/persons to omit from requests

  • Chris and Juancarlos will work on a form to submit email addresses and WBPerson IDs to Postgres

Mailing lists

  • Let Todd know if you want to keep your caltech.edu account on the mailing lists


March 26th

Community curation

  • Do we need to publicize any data form and urge users to contribute? The AFP and phenotype forms already send targeted e-mail
  • Paul S. is getting people from the lab to do phenotype curation; Chris G will run a tutorial for interested people
  • Blog about phenotype form to make people generally aware
  • Could the phenotype form be adapted for other species? Possibly but depends on:
    • Whether genotypes could be loaded and recognized

AFP for older papers

  • Discussed recently
  • Many (all?) old papers have already gone through the old AFP pipeline
  • Need to check what has already been curated

Omit form

Problems accessing Textpresso Central

  • Ranjana had noticed that the server was acting slow
  • Unknown problem; hard to pin down
  • Definitely a Caltech network system issue
  • Issue for past couple of weeks
  • May be able to ask Caltech IMSS for a diagnostic analysis
  • Bad switch? Maybe; there is an unused switch that could be used to test
    • Maybe should dedicate an ethernet socket to the server(s) (avoiding switch altogether)
  • Tazendra is good, mangolassi seems to have had some trouble recently as well
  • All on subnet 52; Raymond has been monitoring and seems to be working well; could suggest a local (e.g. computer-specific, switch-specific) issue

Automated concise pathway descriptions?

  • Could we generate pathway descriptions based on, for example, GO-CAM models?
  • Might be straightforward if clear guidelines can be documented (important players/genes, important/central functions/processes)

Genotype OA on Sandbox

Name Service temp submission for strain and variation to Postgres

  • Sandbox: http://mangolassi.caltech.edu/~postgres/cgi-bin/temp_objects.cgi
  • Adding strains or variations will create the object in the Name Service as well as add it to Postgres
  • Test it and make sure it works; NOTE: Creating objects on both Mangolassi AND Tazendra will create real object entries in the Name Service; if testing on the Sandbox, make sure to tell Paul Davis and Matt Russell to remove test submissions from the Name Service