WormBase-Caltech Weekly Calls
Revision as of 18:10, 12 September 2019
Previous Years
GoToMeeting link: https://www.gotomeet.me/wormbase1
2019 Meetings
September 12, 2019
Update on SVM pipeline
- New SVM pipeline: further analysis and parameter tuning
- Avoiding precision (and F-value) as a performance measure, because both depend on the ratio of positives to negatives in the test set
- A "dumb" machine already starts out with precision above 0.6
- G-value (Michael's invention) does not depend on the distribution of the sets
- Applied to various data types
- Analysis: 10-fold cross-validation
  - Randomly select 10% of the positives and negatives (without replacement) and repeat until all papers have been sampled
- F-value changes across different p/n values; G-value does not (essentially flat)
- Area Under the Curve (AUC): the probability that a random positive scores higher than a random negative
- AUC values for many data types are in the upper 80s into the 90s (percent)
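The points above about precision, cross-validation folds, and AUC can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: all function names are hypothetical, the "dumb" machine is assumed here to be a classifier that labels every paper positive, and the G-value is not shown because the notes do not define it.

```python
import random


def precision_at_ratio(tpr, fpr, n_pos, n_neg):
    """Expected precision of a classifier with fixed true/false positive
    rates on a test set with n_pos positives and n_neg negatives."""
    tp = tpr * n_pos  # expected true positives
    fp = fpr * n_neg  # expected false positives
    return tp / (tp + fp)


def auc_pairwise(pos_scores, neg_scores):
    """AUC computed directly from its definition: the fraction of
    (positive, negative) pairs in which the positive scores higher.
    Ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def ten_fold_split(pos_ids, neg_ids, k=10, seed=0):
    """Shuffle positives and negatives separately, then deal them into k
    folds so every paper lands in exactly one fold (sampling without
    replacement). Returns a list of (fold_positives, fold_negatives)."""
    rng = random.Random(seed)
    pos, neg = list(pos_ids), list(neg_ids)
    rng.shuffle(pos)
    rng.shuffle(neg)
    return [(pos[i::k], neg[i::k]) for i in range(k)]
```

Under these assumptions, the same classifier (true positive rate 0.8, false positive rate 0.2) has precision 0.8 on a test set of 100 positives and 100 negatives, but only about 0.31 on 100 positives and 900 negatives. A label-everything-positive machine on a set with 60% positives has precision 0.6. The pairwise AUC depends only on how positives rank against negatives, so it does not shift with the class ratio in this way.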
Clarifying definitions of "defective" and "deficient" for phenotypes
- WB phenotype ontology has many "variant/abnormal" terms and distinct subclass terms for "defective/deficient"
- Have tried to create a logical definition pattern for these terms, but the vagueness of the meaning of "defective" and how it is distinct from "abnormal" has stalled the process
- What do we mean exactly by "defective" and how, specifically, is this distinct from "abnormal"?
- Existing term definitions include the following words or meanings:
  - "aberrant"
  - "defective"
  - "defect"
  - "defects"
  - "deficiency"
  - "disrupted"
  - "ineffective"
  - "perturbation that disrupts"
  - "variations in the ability"
  - failure to execute the characteristic response = abnormal?
  - abnormal
  - abnormality leading to specific outcomes
  - fail to exhibit the same taxis behavior = abnormal?
  - failure
  - failure OR delayed
  - failure/abnormal