Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
Line 33: Line 33:
 
[[WormBase-Caltech_Weekly_Calls_May_2018|May]]
  
 +
[[WormBase-Caltech_Weekly_Calls_June_2018|June]]
  
== June 7, 2018 ==
 
  
=== Alliance literature pipeline working group ===
+
== July 5, 2018 ==
* Has anyone gotten back to Carol Bult about WB/Textpresso membership in the group? Not yet
 
* Paul & Kimberly can join; Kimberly will sit in on group meetings
 
  
=== SPELL migration ===
+
=== Community Curation Mass Email update ===
* SGD put SPELL into the cloud; will give us code to set up WB SPELL in cloud
 
* Want to have common interface but two instances (yeast & worm)
 
* Worm data in SPELL may be more complicated than the yeast data (different species, platforms, metadata, etc.)
 
* Mike Cherry proposed using GTEx to replace SPELL (www.gtexportal.org)
 
* Trying to get GTEx to have the same functionality as SPELL
 
 
 
=== Progress report ===
 
* Wen can generate numbers of changes since last year
 
* 5-year progress report coming up (for funding agencies)
 
* Want a progress report-like document to give to SAB in July
 
* WS259 - WS265
 
 
 
=== Paper author name ===
 
* Jonathan Ewbank wrote in about an incorrect author name
 
* Issue for WBPaper00048704: https://wormbase.org/resources/paper/WBPaper00048704#0--10
 
* First author is "Li, C" in PubMed (the correct name) but appears in WormBase as "Chun, L"
 
* Will make a GitHub ticket; web team may need to look at it
 
 
 
=== Lab class ===
 
* Lab data is not clean/consistent; has gotten messy
 
* Cecilia trying to clean it up
 
* There is conflicting data from lab class, author class, and person class
 
* Currently there is a largely redundant curation pipeline; considering pulling person info into lab class
 
* When looking at lab page, if there is a problem with a person's info, changes would be made (requested) in the person class, not the lab class
 
* Cecilia will contact labs to ask about correctness of information
 
* Can have a discussion with Ann and Aric at CGC
 
 
 
=== How many C. elegans genes have "good" knockouts? ===
 
* Paul giving talk next week, wants to report
 
* Chris will look into it
 
* Mitani did 1,000 deletions in the last year
 
 
 
=== Server logs ===
 
* SimpleMine logs are all coming from Amazon; will need to ask Todd about SimpleMine usage (from AWS stats?)
 
 
 
=== Migration of lab data ===
 
* Data for labs was coming from Name Server (in Hinxton) - <font color= red>Comment:</font>[PD] Not coming from the Name Server; geneace was the only source of Lab data prior to handover.
 
* Now Paul D has stopped curating lab info - <font color= red>Comment:</font>[PD] Build config has been updated to take the majority of lab data from citace rather than geneace
 
Database        File                                    Class                  remove/format
 
----------------------------------------------------------------------------------------------------------------
 
db=geneace      file=geneace_Laboratory.ace            class=Laboratory        format="Alleles WBVar\d{8}"    delete=Representative  delete=Registered_lab_members  delete=Past_lab_members delete=Allele_designation      delete=Strain_designation      delete=Address
 
db=citace      file=caltech_Laboratory.ace            class=Laboratory        delete=Alleles  format="Representative WBPerson\d{1,5}" format="Registered_lab_members WBPerson\d{1,5}" format="Past_lab_members WBPerson\d{1,5}"
 
 
 
* Now that Cecilia is curating, will pull lab info from Postgres
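
The build-config lines above use a simple <code>key=value</code> layout in which some keys (<code>delete</code>, <code>format</code>) repeat and some values are quoted. As a rough illustration only, and not the actual build code, the following Python sketch shows one way such a line could be read into a dictionary. The function name <code>parse_build_config_line</code> is hypothetical, and the sketch makes no assumption about what the build does with the <code>delete</code> and <code>format</code> entries afterwards.

```python
import shlex
from collections import defaultdict


def parse_build_config_line(line):
    """Split one 'key=value key="quoted value" ...' line into a dict of lists.

    Hypothetical helper for illustration; not part of the WormBase build.
    """
    fields = defaultdict(list)
    # shlex keeps a quoted value such as "Alleles WBVar\d{8}" together as one token
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {token!r}")
        # Keys may repeat (e.g. several delete= entries), so collect values in lists
        fields[key].append(value)
    return dict(fields)


# The geneace line from the notes, with column padding collapsed to single spaces
GENEACE_LINE = (
    r'db=geneace file=geneace_Laboratory.ace class=Laboratory '
    r'format="Alleles WBVar\d{8}" delete=Representative '
    r'delete=Registered_lab_members delete=Past_lab_members '
    r'delete=Allele_designation delete=Strain_designation delete=Address'
)

config = parse_build_config_line(GENEACE_LINE)
print(config["db"], config["class"])
print(config["delete"])
```

Collecting every value in a list keeps the repeated <code>delete</code> and <code>format</code> entries in their original order, which seems the safest default when the downstream semantics are not known.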
 
 
 
 
 
== June 21, 2018 ==
 
 
 
=== SAB ===
 
* Lightning talks at project meeting; everyone should consider what they want to present (5 minutes with 5 minutes for questions)
 
* 45-minute curation talk to SAB; Chris volunteered; anyone else interested?
 
* What data types are being incorporated with the Alliance? What have we gained from those conversations?
 
* How have Alliance interactions benefited us?
 
* How does the Datomic migration affect our curation and data models?
 
* Alliance data models should be the union of existing MOD data models, but this does not require curation of all attributes at each MOD
 
* Do WB and our users benefit from creating a ?Genotype class?
 
 
 
=== Phenotype & Disease Face-to-Face ===
 
* Reviewed phenotype curation practices at each MOD
 
* Discussed strain and genotype classes
 
* Generally it's felt that genotypes should only represent full genotypes of actual individuals or strains
 
* For phenotype annotations WB will still need a mechanism to attribute the specific genotypic components that are responsible for the observed phenotype along with a way to capture the complete or background genotype for context
 
* MGI considers strains to be part of a genotype (i.e. background inbred strain into which alleles are introduced)
 
* WB and SGD consider genotype to be part of strain
 
* WB will still consider moving ahead with instantiating a ?Genotype class for capturing genotypes (e.g. for disease models) that don't have an explicitly reported strain name; also for capturing transient genotypes like heterozygosity as well as paternal and maternal contributions/genotypes
 
 
 
 
 
 
 
== June 28, 2018 ==
 
 
 
=== Bulk emailing community for phenotypes ===
 
* Chris & Juancarlos have now set up a pipeline to send requests in bulk
 
* Can send 75 emails at a time; Gmail may limit to 500 per day (was hoping for 3,000 at a time)
 
* Sent out 338 emails in past week; already received annotations for 18 papers, ~50 annotations
 
* Papers go back to 1987; the earliest paper for which curation was received in the last week is from 2008
 
* Have sent about 25 papers from 2018, got annotations for 5 of them
 
* Ask SAB about email frequency
 
* Would be good to link to a Textpresso search
 
* Maybe make a link people can click on to indicate there is no phenotype data
 
* Can a submitter chat with a curator? Maybe integrate Olark chat? Would be good to make available, indicate how to chat
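
To make the sending limits above concrete, here is a minimal Python sketch of the arithmetic, assuming the figures from the notes (75 emails per batch, a possible cap of 500 per day) and assuming the last batch of a day may be smaller than 75. The function names and the <code>example.org</code> addresses are placeholders, not part of the actual pipeline.

```python
from math import ceil

BATCH_SIZE = 75   # emails per send, as stated in the notes
DAILY_CAP = 500   # possible Gmail daily limit, as stated in the notes


def batches(recipients, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` recipients."""
    for start in range(0, len(recipients), size):
        yield recipients[start:start + size]


def days_needed(total, cap=DAILY_CAP):
    """Number of days needed to send `total` emails under a daily cap."""
    return ceil(total / cap)


# Placeholder addresses standing in for the 338 emails sent in the past week
recipients = [f"author{i}@example.org" for i in range(338)]
batch_sizes = [len(batch) for batch in batches(recipients)]

print(batch_sizes)        # [75, 75, 75, 75, 38]
print(days_needed(3000))  # 6
```

Under these assumptions, the 338 emails correspond to five sends, and a mailing of 3,000 would take six days at the 500-per-day cap.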
 
 
 
=== SAB ===
 
* Chris & Paul will present (curation & intro respectively)
 
* Tools? Raymond can bring up enrichment analysis
 
* Project meeting
 
** Everyone should sign up for a topic if they haven't already
 
 
 
=== Phenotype Ontology ===
 
* Is the Phenotype ontology growing? Very slowly
 
* Received several suggested phenotypes from community in last batch
 
* Want to focus on logical definitions
 
* Mammalian Phenotype ontology - precomposed terms with logical definitions
 
* Raymond trying to use the SObA approach to display logically defined phenotype terms; needs to figure out how to effectively model it
 
* Should get useful feedback at the ICBO meeting in August
 
** Would be good to establish ahead of time what we want to get from the meeting
 
* Want feedback on:
 
** Pre-composed vs. post-composed
 
** Granularity
 
** Logical definitions
 
* Looking up phenotypes: want to make more user friendly
 
** We could map phenotype terms to processes to allow community to find relevant terms
 
** Want to find "male hook" phenotypes; correct term is "male copulatory structure variant"; the term "hook" should find "male copulatory structure"
 
* What is the best use of a user's time? curator's time?
 
* How are ontology logical definitions being used now?
 
** Kimberly: To calculate/reason the inferences based on ontology parentage
 
** Logical definitions are not very visible on relevant pages (like AmiGO); should we make them more visible to users?
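
The "hook" example above suggests a keyword-to-term lookup. The Python sketch below is one possible shape for it, offered only as an illustration: the <code>TERM_KEYWORDS</code> table is hypothetical (only the "hook" and "male copulatory structure variant" strings come from the notes), and simple substring matching stands in for whatever search the site would really use.

```python
from collections import defaultdict

# Hypothetical keyword table; only the first entry reflects the notes
TERM_KEYWORDS = {
    "male copulatory structure variant": ["hook", "male copulatory structure"],
    "placeholder term for illustration": ["placeholder keyword"],
}


def build_index(term_keywords):
    """Map each lower-cased keyword (and the term name itself) to its terms."""
    index = defaultdict(set)
    for term, keywords in term_keywords.items():
        for word in [term, *keywords]:
            index[word.lower()].add(term)
    return index


def find_terms(query, index):
    """Return terms whose keyword contains the query or is contained in it."""
    q = query.lower().strip()
    if not q:
        return []
    hits = set()
    for keyword, terms in index.items():
        if keyword in q or q in keyword:
            hits |= terms
    return sorted(hits)


index = build_index(TERM_KEYWORDS)
print(find_terms("hook", index))
print(find_terms("male hook", index))
```

Matching in both directions lets a short query such as "hook" and a longer one such as "male hook" reach the same term; a real implementation would likely need stemming and ranking, which are out of scope here.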
 
 
 
=== Protein interactions ===
 
* Now up to date; how should we publish this info?
 
* Can make a blog post and/or micropublication and put in next NAR paper
 
 
 
=== Marie-Claire arriving next week ===
 
* What's the plan?
 
* Phenotype curation? GO curation?
 
* Daniela wants help with Picture curation; could explain over Skype
 
** Mega-sync system could help (Daniela will discuss with Valerio)
 
** Images stored on canopus machine; can canopus be bypassed in sync process?
 
** Mega has data limits (50GB+ requires a paid account)
 
** Could set up a custom sync pipeline locally
 
 
 
=== GO vs. Phenotype ===
 
* Lots of GO QC/review going on now; Marie-Claire could start there
 
* Would it be easier to start out with GO or phenotype? Just different
 
* Kimberly could walk her through next week
 
* Paul will try to get Marie-Claire to Toronto for the SAB
 
 
 
=== Alliance U24 grant ===
 
* ~30-pager due September (using familiar format)
 
* Databases, knowledge-bases and tools will all be considered separately
 
* Format changes after September
 
* Scope could be focused on web portal and infrastructure (not curation, for example)
 

Revision as of 14:10, 5 July 2018