Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
(310 intermediate revisions by 11 users not shown)
[[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]]

[[WormBase-Caltech_Weekly_Calls_2019|2019 Meetings]]

GoToMeeting link: https://www.gotomeet.me/wormbase1

= 2019 Meetings =
 
 
[[WormBase-Caltech_Weekly_Calls_January_2019|January]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_February_2019|February]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_March_2019|March]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_April_2019|April]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_May_2019|May]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_June_2019|June]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_July_2019|July]]
 
 
 
[[WormBase-Caltech_Weekly_Calls_August_2019|August]]
 
 
 
 
 
== September 12, 2019 ==
 
 
 
=== Update on SVM pipeline ===
 
* New SVM pipeline: more analysis and more parameter tuning
 
* Avoiding precision (and F-value) as a measure, because both depend on the ratio of positives to negatives in the test set

* In the example shown, a "dumb" machine starts out with precision above 0.6
 
* G-value (Michael's invention); does not depend on distribution of sets
 
* Applied to various data types
 
* Analysis: 10-fold cross validation
 
** Randomly select 10% pos and neg (without replacement) and repeat until all papers sampled
 
* F-value changes over different p/n values; G-value does not (essentially flat)
 
* Area Under the Curve (AUC): probability that a random positive scores higher than a random negative

* AUC values for many WB data types range from the upper 80s into the 90s (percent)
 
* Ranjana: How many papers for a good training set? Michael: we don't know yet
 
* Can't reproduce old training sets (for old SVM); provide Michael with better training sets if you want an improved SVM

* If the SVM is still not good enough, Michael will work on deep neural networks (TensorFlow)
 
* Michael can provide training sets he has used recently
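
The points above about precision, AUC, and 10-fold cross validation can be illustrated with a minimal Python sketch. This is not Michael's pipeline: the function names, the toy scores, and the "flag everything" reading of the "dumb" machine are assumptions made only for illustration, and the G-value is not shown because its definition is not given in these notes.

```python
import random

def precision_at_ratio(tpr, fpr, n_pos, n_neg):
    """Expected precision for fixed true/false positive rates and class sizes."""
    tp = tpr * n_pos
    fp = fpr * n_neg
    return tp / (tp + fp)

def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.
    Ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def stratified_folds(pos_ids, neg_ids, k=10, seed=0):
    """Split papers into k folds, each holding about 1/k of the positives and
    1/k of the negatives, sampled without replacement."""
    rng = random.Random(seed)
    pos, neg = list(pos_ids), list(neg_ids)
    rng.shuffle(pos)
    rng.shuffle(neg)
    return [pos[i::k] + neg[i::k] for i in range(k)]

# Assumed "dumb" classifier that flags every paper (tpr = fpr = 1):
# its precision is simply the fraction of positives in the test set.
print(precision_at_ratio(1.0, 1.0, 600, 400))            # 0.6
print(precision_at_ratio(1.0, 1.0, 100, 900))            # 0.1

# Same classifier quality, different class ratio: precision shifts.
print(round(precision_at_ratio(0.8, 0.1, 500, 500), 3))  # 0.889
print(round(precision_at_ratio(0.8, 0.1, 100, 900), 3))  # 0.471

# AUC depends only on how positives rank against negatives, so repeating
# the negatives five times leaves it unchanged.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]
print(round(auc(pos, neg), 3), round(auc(pos, neg * 5), 3))

# Ten folds over 20 positive and 50 negative (hypothetical) paper IDs.
folds = stratified_folds(range(20), range(100, 150))
print([len(fold) for fold in folds])
```

In a real run each fold would be held out once for testing while the other nine train the classifier.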
 
 
 
=== Clarifying definitions of "defective" and "deficient" for phenotypes ===
 
* WB phenotype ontology has many "variant/abnormal" terms and distinct subclass terms for "defective/deficient"
 
* Have tried to create a logical definition pattern for these terms, but the vagueness of the meaning of "defective" and how it is distinct from "abnormal" has stalled the process
 
* What do we mean exactly by "defective" and how, specifically, is this distinct from "abnormal"?
 
* Definitions include meanings or words:
 
** "Variations in the ability"
 
** "aberrant"
 
** "defect"
 
** "defective"
 
** "defects"
 
** "deficiency"
 
** "deficient"
 
** "disrupted"
 
** "impaired"
 
** "incompetent"
 
** "ineffective"
 
** "perturbation that disrupts"
 
** Failure to execute the characteristic response = abnormal?
 
** abnormal
 
** abnormality leading to specific outcomes
 
** fail to exhibit the same taxis behavior = abnormal?
 
** failure
 
** failure OR delayed
 
** failure, slower OR late
 
** failure/abnormal
 
** reduced
 
** slower
 
 
 
=== Citace upload ===
 
* Tuesday, Sep 24th
 
 
 
=== Strain to ID mapping ===
 
* Waiting on Hinxton to send strain ID mapping file?
 
* Hopefully we can all get that well before the upload deadline
 
* Will do global replacement at time of citace upload (at least for now)
 
 
 
=== New name server ===
 
* When will this officially go live?
 
* Will we now be able to request strain IDs through the server? Yes
 
 
 
=== SObA Graphs ===
 
* New graphs now live on site (Expression, Gene Ontology, Human Diseases, Phenotypes)
 
* A lot of whitespace padding above and below the graph; maybe trim? Trimming vertically would ultimately limit the view pane when a user wants to zoom in, so we should leave it as is for now
 
* Diff tool: Raymond and Juancarlos created a prototype diff tool (for comparing two genes, for example)
 
** Paul: compared two genes that should be very similar, but there are a lot of differences; may reflect annotation coverage rather than biology
 
 
 
  
== September 19, 2019 ==

=== Strains ===

* Need to wait for new strain IDs from Hinxton before running dumping scripts
 
* Don't edit multi-ontology strain fields in OA for now!
 
* Juancarlos will map free text and ontology-name strain entries to strain IDs once we have the complete mapping file
 
* "Requested strain" field in Disease OA; not dumped, so don't need to worry about right now
 
  
=== Alliance literature curation ===

* Working group will be formed soon
 
* Will work out general common pipelines for literature curation
 
  
=== SObA Graph relations ===

* Currently only integrating over "is a", "part of" and "regulates"
 
* Maybe we could provide users an option to specify which relations to include, or maybe just exclude "regulates"
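
The idea of letting users choose which relations to integrate over can be sketched as follows. This is only an illustration, not the SObA implementation: the edge list, the term names, and the `ancestors` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy ontology edges: (child term, relation, parent term).
EDGES = [
    ("spindle assembly", "part of", "mitosis"),
    ("mitosis", "is a", "cell cycle process"),
    ("regulation of mitosis", "regulates", "mitosis"),
]

def ancestors(term, edges, relations):
    """Collect every term reachable from `term` by following only the
    relation types listed in `relations`."""
    parents = defaultdict(set)
    for child, relation, parent in edges:
        if relation in relations:
            parents[child].add(parent)
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# With "regulates" included, an annotation to "regulation of mitosis"
# rolls up to "mitosis" and its ancestors.
print(ancestors("regulation of mitosis", EDGES, {"is a", "part of", "regulates"}))

# With "regulates" excluded, the same annotation rolls up to nothing.
print(ancestors("regulation of mitosis", EDGES, {"is a", "part of"}))
```

Passing the set of relations as a parameter keeps the traversal itself unchanged, so a user-facing option would only need to change that one argument.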
 
  
=== Author First Pass ===
 
* Putting together paper for AFP
 
* Reviewing all user input for paper
 
* Asking individual curators to check input
 
  
== September 26, 2019 ==
  
=== Data mining ===

* Someone in Paul's lab asking to retrieve list of C. elegans orthologs from a list of human genes

* Could we build a (simple) Alliance tool to do this?

* Could SimpleMine do this? Could we build a SimpleMine-like tool for Alliance?
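
The kind of lookup being asked for could be as simple as the sketch below. It is not an existing SimpleMine or Alliance feature: the `ORTHOLOG_PAIRS` table is a small hand-written stand-in for a real orthology file, and the function name is hypothetical.

```python
from collections import defaultdict

# Illustrative (human gene, C. elegans gene) pairs; a real tool would load
# these from an orthology data file rather than hard-code them.
ORTHOLOG_PAIRS = [
    ("TP53", "cep-1"),
    ("INSR", "daf-2"),
    ("IGF1R", "daf-2"),
    ("BRCA1", "brc-1"),
]

def worm_orthologs(human_genes, pairs):
    """Map each queried human gene symbol to a sorted list of worm orthologs.
    Matching is case-insensitive; genes with no match get an empty list."""
    index = defaultdict(list)
    for human, worm in pairs:
        index[human.upper()].append(worm)
    return {gene: sorted(index.get(gene.upper(), [])) for gene in human_genes}

for gene, worms in worm_orthologs(["tp53", "INSR", "FOO"], ORTHOLOG_PAIRS).items():
    print(gene, ", ".join(worms) if worms else "no ortholog found")
```

Returning an empty list for unmatched genes keeps every input gene visible in the output, which matters when a user pastes in a long list.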
  
=== Strains ===

* Paul D generated WBStrains for the missing TransgeneOme objects

* Working on a pipeline to identify new TransgeneOme strains at each upload

* One TransgeneOme object had 2 strains. Possible solutions: dump 2 expression objects that differ only in the Strain or remove the UNIQUE tag in the data model

* Raymond: concerned about automatically generating strains based on imports from the group

* Many odd strain names are coming from the TransgeneOme group; maybe we ought to have more discussions about generating official (following nomenclature standards) strain names from their imports

* Quarantine strains on initial import; review and accept if they pass standards
  
=== Community phenotype requests August 2019 ===

* Sent out new round of phenotype requests on August 20, 21, and 22, 2019

* 2,626 emails/papers requested

* 114 emails bounced; 5 resent to new addresses
 
* 460 Phenotype OA community annotations; 181 RNAi OA annotations (641 annotations total)
 
* From 94 papers (83 for Phenotype OA; 33 for RNAi; 22 for both)
 
* By 81 distinct community curators (70 for Phenotype OA; 32 for RNAi OA; 21 for both)
 
* 50 papers flagged as not having phenotypes (40 papers DO have phenotypes; 10 marked as negative; 80% failure rate!)
 
** Email states: "If there are no nematode phenotypes in this paper click the following link :"
 
* 4 papers flagged for phenotypes (only 2 had curatable phenotypes; 1 had honey-induced phenotypes)
 
* 115 papers with responses (5% response); 24 papers with input that were not main focus of request
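
The totals and rates in this list can be reproduced from the raw counts. The short Python check below uses only numbers from the notes; the variable names are illustrative.

```python
# Papers and curators counted in both OAs must be subtracted once.
papers_with_annotations = 83 + 33 - 22
distinct_curators = 70 + 32 - 21
total_annotations = 460 + 181

# Share of "no phenotypes" flags that were wrong (paper did have phenotypes).
wrong_negative_flag_rate = 40 / 50

# Papers with any response, out of all emails/papers requested.
response_rate = 115 / 2626

print(papers_with_annotations)            # 94
print(distinct_curators)                  # 81
print(total_annotations)                  # 641
print(f"{wrong_negative_flag_rate:.0%}")  # 80%
print(f"{response_rate:.1%}")             # 4.4%
```

The response rate computed this way is about 4.4%, which the notes report as roughly 5%.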
 

Revision as of 00:09, 5 June 2020

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings


GoToMeeting link: https://www.gotomeet.me/wormbase1

2020 Meetings

January

February

March

April

May


June 4, 2020

Citace (tentative) upload

  • CIT curators upload to citace on Tuesday, July 7th, 10am Pacific
  • Citace upload to Hinxton on Friday, July 10th

Caltech reopening

  • Paul looking to get plan approved
  • People who want to come to campus need to watch the training video
  • Masks available in Paul's lab
  • Can have maximum of 3 people in WormBase rooms at a time; probably best to only allow one person per WB room
    • Could possibly have 2 people in big room (Church 64) as long as they stay at least 10 feet apart
  • Need to coordinate, maybe make a Google calendar to do so (also Slack)
  • Before and after you go to campus, you need to take your temperature and assess your symptoms (if any) and submit info on form
  • Also, need to submit who you were in contact with for contact tracing
  • The form is used all week; hold on to it until you are asked to submit it
  • If someone goes in to the office, they could print several forms for people to pick up in WB offices

Nameserver

  • Nameserver was down
  • CIT curators would still like to have a single form to interact with
  • Is it possible to create objects at Caltech and let a cronjob assign IDs via the nameserver? May not be a good idea
  • Still putting genotype and all info for a strain in the reason/why field in the nameserver
  • We plan to eventually connect strains to genotypes, but need model changes and curation effort to sort out
  • Hinxton is pulling in CGC strains, how often?
  • Caltech could possibly get a block of IDs

Alliance SimpleMine

  • Any updates? 3.1 feature freeze is tomorrow
  • Pending a PI decision; Paul S. will bring it up tomorrow on the Alliance PI call