Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
 
[[WormBase-Caltech_Weekly_Calls_August_2018|August]]
 
  
 
[[WormBase-Caltech_Weekly_Calls_September_2018|September]]
== September 6, 2018 ==
 
 
 
=== Genotype class ===
 
* Chris started initial document to draw up ?Genotype class and make appropriate changes to ?Strain class
 
**  https://docs.google.com/document/d/19hP9r6BpPW3FSAeC_67FNyNq58NGp4eaXBT42Ch3gDE/edit?usp=sharing
 
* Would be good for people to look at so we can discuss next time
 
* Would also be good to have Kevin H take a look and provide feedback
 
 
 
=== Citace Upload ===
 
* Send citace files to Wen by Sept 18, 10am Pacific
 
 
 
=== Automated gene descriptions ===
 
* Group making improvements
 
* Added disease, protein domains
 
** When there is direct experimental evidence for disease relevance, the description will say "gene has been used to study"
 
* When data are minimal (information-poor genes), we can refer to the human ortholog and the data stored for that human gene in the Alliance
 
* Continue to receive feedback from users; include enrichment information, etc.
 
* Now have good trimming algorithms to retain important info without flooding a description with too many granular terms
 
* Will not store automated descriptions in Postgres
 
* Wen will modify SimpleMine scripts to accommodate this change
 
* Would be good to write a paper on automated concise descriptions
 
* Came up at GO meeting/hackathon: Translating GO-CAM models into concise descriptions?
 
** Should be doable; just need to develop code when we're ready to do that
 
 
 
=== Textpresso presentation at next Alliance all-hands call ===
 
* Who should present? Maybe have several people? Kimberly, Valerio, Michael for sections?
 
* We can discuss at next Textpresso meeting
 
* Should cover techniques and software, but keep it generally simple and comprehensible for a larger audience
 
 
 
 
 
== September 13, 2018 ==
 
 
 
=== Citace upload ===
 
* Send files to Wen by 10am Tuesday (18th)
 
 
 
=== Genotype class ===
 
* [https://docs.google.com/document/d/19hP9r6BpPW3FSAeC_67FNyNq58NGp4eaXBT42Ch3gDE/edit?usp=sharing ?Genotype class proposal]
 
* Genotype_name tag: free text summary of the genotype
 
** Can this be automatically generated from components? Ideally, yes, but may be difficult
 
** Otherwise, can be manually written, as we have been doing, but is a bit denormalized and may require maintenance
 
* Genotype_description tag: free-text description of the genotype
 
** No precedent and probably not going to start now, so will remove from model
 
* Genotype_components supertag: to collect genotype component objects and, where necessary, free text
 
** We want to be able to express zygosity for each referenced object; this likely requires a #Zygosity hash (and hence a ?Zygosity model)
 
** ?Zygosity model can have three main tags: "Homozygous", "Heterozygous_with_wild_type", or "Heteroallelic_combination_with"
 
*** Heteroallelic_combination_with tag could further specify the type and identity of the object that is in heteroallelic combination with the original object
 
*** Since it is not ideal to store an arbitrary component in the #Zygosity hash, we should probably just state the zygosity as "Heteroallelic_combination" and for display purposes have an automated way to calculate which components affect the same locus/loci (if necessary)
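The last point above (store only "Heteroallelic_combination" and compute the partner components for display) can be sketched as follows. This is a minimal illustration in Python, not the ACeDB model itself: the <code>Component</code> class, its <code>locus</code> field, and the function name are hypothetical, and it assumes each component can be mapped to a single locus identifier.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Zygosity(Enum):
    # The three zygosity states discussed in the notes above.
    HOMOZYGOUS = "Homozygous"
    HETEROZYGOUS_WITH_WILD_TYPE = "Heterozygous_with_wild_type"
    HETEROALLELIC_COMBINATION = "Heteroallelic_combination"


@dataclass(frozen=True)
class Component:
    # Hypothetical stand-in for a genotype component object.
    name: str           # component name (placeholder values only)
    locus: str          # assumed: one locus identifier per component
    zygosity: Zygosity


def heteroallelic_partners(components):
    """Group heteroallelic components that affect the same locus."""
    by_locus = defaultdict(list)
    for component in components:
        if component.zygosity is Zygosity.HETEROALLELIC_COMBINATION:
            by_locus[component.locus].append(component.name)
    # Only loci with two or more components form a combination.
    return {
        locus: sorted(names)
        for locus, names in by_locus.items()
        if len(names) > 1
    }
```

With this approach the #Zygosity hash never stores a reference to another component; the pairing is derived at display time, at the cost of needing a reliable component-to-locus mapping.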
 
 
 
 
 
 
 
== September 20, 2018 ==
 
 
 
=== Kimberly's talk at Rutgers ===
 
* Kimberly went to a worm meeting at Rutgers (10 labs using C. elegans? 6-7 totally worm-centric)
 
* 30-40 attendees (PIs, postdocs, grad students)
 
* Discussed tools and features at WB
 
* Presented Alliance pages and Textpresso
 
* PIs are enthusiastic about WB
 
* Monica Driscoll made a good plug for Textpresso
 
* People requesting FAQs and user guides (text and videos)
 
* Monica suggested a WB tutorial for PIs ;p
 
* Some people surprised about what they can accomplish using the tools available, like SimpleMine
 
* Covered gene set enrichment, WormMine, SPELL, SimpleMine, ParaSite BioMart, Textpresso
 
* Would be good to show people how to use Textpresso Central
 
* Discussed micropublications, asked about negative results (precedent?)
 
* Some asked about WB funding and Alliance plans
 
* Can we make a within-page search available to find, for example, field names?
 
* Some challenges in finding genes/proteins of a certain class
 
** Had question about histone genes recently
 
** Repeatedly have had questions about finding "ion channels"
 
** Searching gene class with text pulls out lots of false positives
 
** Could perform an analysis on particular classes of genes (e.g. histones or ion channels) and generate a micropublication providing the curated list
 
** Can generate a WormMine template query to pull these out for each release
 
** What classes of genes would we want to identify? Histones, transcription factors, ion channels, protein kinases
 
** Chris will look into WormMine templates using gene class info and look into pulling in protein motif information
 
** We will ask other MODs and UniProt about how they deal with this issue
 
 
 
 
 
== September 27, 2018 ==
 
=== Update on the new AFP form and pipeline ===
 
*Daniela, Kimberly, Juancarlos, and Valerio will update on the current status of the new AFP form and pipeline
 
**Overall, the goal has been to incorporate as much Textpresso-based entity and data-type flagging as possible into the form
 
**Move from author data flagging to author data validation wherever we can
 
**Provide opportunities for authors to submit more detailed curation if they want
 
* General: Data types flagged positive by the SVM get a pre-checked checkbox
 
* General: Question mark icons with help text
 
* Gene recognition
 
** Need to set a threshold of mentions; don't necessarily want all genes mentioned once
 
*** Can we show all genes, ranked by occurrence?
 
** Don't want to overwhelm users
 
** How are the genes identified? Via the Textpresso pipeline: string matching, then consolidating multiple instances (protein, gene, etc.) into a single gene result
 
** Searches include supplemental materials
 
** Cannot search by section of paper
 
** Can we identify genes other than C. elegans/worm? We are not doing so now and will stick to C. elegans for the time being
 
** Will expand to non-elegans nematodes in future; will expand to other species when extending to other MODs/Alliance members
 
** Chris: should we show the name of the gene as mentioned, verbatim, from the paper?
 
*** Karen: No, we should insist authors use the proper names
 
*** Chris: Meant cases where the paper references sequence names but a public name has been assigned by the time the AFP form goes to authors, causing confusion
 
** Can we pull genes from tables? We are pulling from PDF tables, but not supplemental Excel tables, for example
 
* Gene model updates: checkbox yes/no
 
* Species in paper
 
** Including worm, mouse, human, yeast
 
** Still more work to do on this front
 
* Alleles recognized
 
** Show list of allele names and WBVar IDs for confirmation
 
** Can submit new alleles within the AFP form (just allele names, no genes or other info; keeping it simple)
 
* Allele sequence change checkbox yes/no (link to Allele sequence info form)
 
* Can there be a feedback option readily available? There is a comments section toward the end of the form under "Anything else?"
 
* Transgenes handled like alleles
 
* Antibodies
 
** Newly generated antibodies: checkbox and text field (ask for details? consistency with alleles?); maybe shouldn't ask for antibody details, or can make details optional
 
** Form for existing antibodies
 
* Expression data
 
** Anatomic expression in WT
 
** Site of action (may be difficult to interpret user input; ask for example; make text details required)
 
** Time of action
 
** RNAseq data
 
* Microarrays - just link out to GEO
 
* Interactions (all SVM based, three checkboxes)
 
* Phenotypes (SVMs, link to phenotype form)
 
* Disease
 
** Checkbox for worm orthologs of human disease gene, etc.
 
* Comments section (to point out missing data types, provide general comments on form)
 
** Ask for unpublished data and suggest micropublication
 
* Final thank you and update contact info and lineage
 
* CIT feedback
 
** Maybe make font size larger
 
** Mobile device compatible? Yes
 
** Change "Anything else?" to "Anything else? Comments?"
 
** Can people save and return later? Yes
 
*** How do we know they're finished? There is a "Finish and submit" button at end (but authors can still go back and make changes later)
 
*** Maybe move "Finish and submit" button to left panel so it is always visible? Maybe make the button stand alone?
 
** If authors indicate there are physical interactions, can we distinguish elegans-elegans interactions vs. non-elegans or interspecies interactions? No, we cannot yet distinguish
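The gene-recognition points above (consolidate multiple instances into a single gene result, rank genes by occurrence, and apply a mention threshold) can be sketched as follows. This is an illustrative Python sketch only and is not the Textpresso pipeline: the function name, the <code>synonyms</code> mapping, and the default threshold of 2 are all assumptions made for the example.

```python
from collections import Counter


def rank_gene_mentions(mentions, synonyms=None, min_mentions=2):
    """Rank genes by mention count and drop those below a threshold.

    mentions:     list of matched strings, one per occurrence in the paper
    synonyms:     hypothetical map from a variant form (e.g. a protein
                  name) to the single gene name it should be counted under
    min_mentions: assumed cutoff; the notes leave the real value open
    """
    synonyms = synonyms or {}
    # Consolidate variant forms into one gene result before counting.
    counts = Counter(synonyms.get(mention, mention) for mention in mentions)
    # Most-mentioned genes first; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(gene, n) for gene, n in ranked if n >= min_mentions]
```

Setting <code>min_mentions=1</code> returns every gene ranked by occurrence, which corresponds to the "show all genes, ranked by occurrence" option raised in the discussion.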
 
 
 
=== Genetics and G3 papers in Textpresso ===
 
* These papers don't have a PMID when they first enter WB, only a DOI (and most of the time the DOI doesn't work yet)
 
* DOI should work right away; Karen will look into whether there's a problem/typo
 
* Daniel needs to keep track and go back to merge WBPapers once the PMID goes live
 
* Kimberly or Karen may have to send papers directly to Daniel for uploading
 
* Should Daniel only download papers with a PubMed ID? Yes, except for micropublications?
 
* Need a separate pipeline for micropublications? Daniel is currently downloading the papers
 
 
 
=== ParaSite (non-elegans) papers ===
 
* Should Daniel be trying to download all of these papers? Many are hard to track down
 
* Daniel should ask Michael Paulini
 

Revision as of 15:27, 4 October 2018