Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki

Revision as of 16:39, 19 September 2019

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings


GoToMeeting link: https://www.gotomeet.me/wormbase1


2019 Meetings

January

February

March

April

May

June

July

August


September 12, 2019

Update on SVM pipeline

  • New SVM pipeline: more analysis and more parameter tuning
  • Avoiding precision (and F-value) as a measure, since both depend on the ratio of positives to negatives in the test set
  • In the example shown, a "dumb" machine starts out with precision above 0.6
  • G-value (Michael's invention) does not depend on the distribution of the sets
  • Applied to various data types
  • Analysis: 10-fold cross-validation
    • Randomly select 10% of positives and negatives (without replacement) and repeat until all papers are sampled
  • F-value changes over different p/n values; G-value does not (essentially flat)
  • Area Under the Curve (AUC): probability that a random positive scores higher than a random negative
  • AUC values for many WB data types range from the upper 80s into the 90s (percent)
  • Ranjana: How many papers make a good training set? Michael: we don't know yet
  • Can't reproduce old training sets (for old SVM); provide Michael with better training sets if you want an improved SVM
  • If the SVM is still not good enough, Michael will work on deep neural networks (TensorFlow)
  • Michael can provide training sets he has used recently
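
The evaluation ideas in the notes above can be illustrated with a small Python sketch. It is not Michael's pipeline: the function names, the fold-splitting scheme, and the toy scores are assumptions made for illustration, and the G-value is left out because its definition is not given in these notes. The sketch shows a 10-fold split that samples positives and negatives without replacement, AUC computed directly as the probability that a random positive outscores a random negative, and the precision of a "dumb" classifier that labels everything positive.

```python
import random

def stratified_folds(positives, negatives, k=10, seed=0):
    """Split positives and negatives into k folds, sampling without replacement."""
    rng = random.Random(seed)
    pos, neg = list(positives), list(negatives)
    rng.shuffle(pos)
    rng.shuffle(neg)
    # Fold i takes every k-th item, so each fold holds about 1/k of each class
    # and every paper lands in exactly one fold.
    return [(pos[i::k], neg[i::k]) for i in range(k)]

def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def precision_all_positive(n_pos, n_neg):
    """Precision of a 'dumb' classifier that labels every paper positive."""
    return n_pos / (n_pos + n_neg)

# Toy scores, made up for illustration.
scores_pos = [0.9, 0.8]
scores_neg = [0.1, 0.85]
print(auc(scores_pos, scores_neg))       # 0.75
print(auc(scores_pos, scores_neg * 5))   # still 0.75: AUC ignores the class ratio
print(precision_all_positive(60, 40))    # 0.6
print(precision_all_positive(10, 90))    # 0.1
```

The last two lines show why precision alone can mislead: the same do-nothing classifier scores 0.6 or 0.1 depending only on how many positives the test set contains.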

Clarifying definitions of "defective" and "deficient" for phenotypes

  • WB phenotype ontology has many "variant/abnormal" terms and distinct subclass terms for "defective/deficient"
  • Have tried to create a logical definition pattern for these terms, but the vagueness of the meaning of "defective" and how it is distinct from "abnormal" has stalled the process
  • What do we mean exactly by "defective" and how, specifically, is this distinct from "abnormal"?
  • Existing definitions use the following words or meanings:
    • "Variations in the ability"
    • "aberrant"
    • "defect"
    • "defective"
    • "defects"
    • "deficiency"
    • "deficient"
    • "disrupted"
    • "impaired"
    • "incompetent"
    • "ineffective"
    • "perturbation that disrupts"
    • Failure to execute the characteristic response = abnormal?
    • abnormal
    • abnormality leading to specific outcomes
    • fail to exhibit the same taxis behavior = abnormal?
    • failure
    • failure OR delayed
    • failure, slower OR late
    • failure/abnormal
    • reduced
    • slower

Citace upload

  • Tuesday, Sep 24th

Strain to ID mapping

  • Waiting on Hinxton to send strain ID mapping file?
  • Hopefully we can all get that well before the upload deadline
  • Will do global replacement at time of citace upload (at least for now)
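
A global replacement of strain names by strain IDs might look roughly like the Python sketch below. The mapping file format (tab-separated ID and name), the function names, and the IDs are all assumptions made for illustration; the actual file from Hinxton and the upload scripts may differ.

```python
import csv
import io

def load_mapping(handle):
    """Read 'strain_id<TAB>strain_name' rows into a name -> ID dict."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:
            strain_id, name = row[0].strip(), row[1].strip()
            mapping[name] = strain_id
    return mapping

def replace_strains(names, mapping):
    """Return (converted, unmapped): IDs where known, original names otherwise."""
    converted, unmapped = [], []
    for name in names:
        key = name.strip()
        if key in mapping:
            converted.append(mapping[key])
        else:
            # Keep the original entry and report it for manual follow-up.
            converted.append(name)
            unmapped.append(name)
    return converted, unmapped

# Made-up IDs for illustration only; real IDs come from the mapping file.
example_file = io.StringIO("WBStrain_EX1\tN2\nWBStrain_EX2\tCB1370\n")
mapping = load_mapping(example_file)
converted, unmapped = replace_strains(["N2", "CB1370 ", "XX999"], mapping)
print(converted)  # ['WBStrain_EX1', 'WBStrain_EX2', 'XX999']
print(unmapped)   # ['XX999']
```

Collecting the unmapped names separately, rather than dropping them, keeps the replacement safe to run before the mapping file is complete.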

New name server

  • When will this officially go live?
  • Will we now be able to request strain IDs through the server? Yes

SObA Graphs

  • New graphs now live on site (Expression, Gene Ontology, Human Diseases, Phenotypes)
  • A lot of whitespace padding above and below the graph; maybe trim? Trimming vertically would ultimately limit the view pane when a user wants to zoom in, so we should leave it as is for now
  • Diff tool: Raymond and Juancarlos created a prototype diff tool (for comparing two genes, for example)
    • Paul: compared two genes that should be very similar, but there are a lot of differences; may reflect annotation coverage rather than biology
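
The core of such a comparison can be sketched as a set difference over the terms annotated to each gene. This is not the prototype Raymond and Juancarlos built; the function name and the term labels below are hypothetical.

```python
def diff_annotations(terms_a, terms_b):
    """Compare two genes' annotation term sets."""
    a, b = set(terms_a), set(terms_b)
    return {
        "only_a": sorted(a - b),   # terms annotated to gene A only
        "only_b": sorted(b - a),   # terms annotated to gene B only
        "shared": sorted(a & b),   # terms annotated to both genes
    }

# Hypothetical term labels for two genes.
gene_a_terms = ["term1", "term2", "term3"]
gene_b_terms = ["term2", "term4"]
result = diff_annotations(gene_a_terms, gene_b_terms)
print(result)
# {'only_a': ['term1', 'term3'], 'only_b': ['term4'], 'shared': ['term2']}
```

A term appearing under only one gene shows only that the other gene lacks that annotation, which fits Paul's caution that differences may reflect annotation coverage rather than biology.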


September 19, 2019

Strains

  • Need to wait for new strain IDs from Hinxton before running dumping scripts
  • Don't edit multi-ontology strain fields in OA for now!
  • Juancarlos will map free text and ontology-name strain entries to strain IDs once we have the complete mapping file
  • "Requested strain" field in Disease OA is not dumped, so no need to worry about it right now

Alliance literature curation

  • Working group will be formed soon
  • Will work out general common pipelines for literature curation

SObA Graph relations

  • Currently only integrating over "is a", "part of" and "regulates"
  • Maybe we could provide users an option to specify which relations to include, or maybe just exclude "regulates"
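
One way to picture a user-selectable set of relations is a graph traversal that follows only the chosen edge types. The Python sketch below is an assumption-laden illustration, not the SObA implementation: the edge representation, function name, and toy ontology are all made up.

```python
from collections import deque

def ancestors(term, edges, relations):
    """Collect ancestors of `term`, following only edges whose relation is allowed.

    `edges` maps a term to a list of (relation, parent) pairs.
    """
    allowed = set(relations)
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for relation, parent in edges.get(current, []):
            if relation in allowed and parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

# Toy ontology with made-up term names.
toy_edges = {
    "A": [("is a", "B"), ("regulates", "D")],
    "B": [("part of", "C")],
    "D": [("is a", "E")],
}
print(sorted(ancestors("A", toy_edges, ["is a", "part of", "regulates"])))
# ['B', 'C', 'D', 'E']
print(sorted(ancestors("A", toy_edges, ["is a", "part of"])))
# ['B', 'C']
```

Dropping "regulates" from the list removes D and everything reached through it, which is the effect a user would get from excluding that relation.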

Author First Pass

  • Putting together paper for AFP
  • Reviewing all user input for paper
  • Asking individual curators to check input