Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
(15 intermediate revisions by 2 users not shown)
 
[[WormBase-Caltech_Weekly_Calls_July_2019|July]]
 
  
 +
[[WormBase-Caltech_Weekly_Calls_August_2019|August]]
  
== August 1, 2019 ==
 
  
=== Life stage public names missing in WS271 ===
+
== September 12, 2019 ==
* Did we ever get a patch in for this?
 
* WormMine only has WBls IDs (no public name) for almost all life stages
 
* Wen will resend the patch file
 
  
=== 2020 WB NAR paper ===
+
=== Update on SVM pipeline ===
* Who's contributing?
+
* New SVM pipeline: more analysis and more parameter tuning
** Raymond, Chris, Ranjana, Valerio, Daniela, Kimberly
+
* Avoiding precision (and F-value) as a measure, since both depend on the ratio of positives to negatives in the test set
* Topics
+
* In the example shown, a "dumb" machine starts out with precision above 0.6
** Automated descriptions
+
* G-value (Michael's invention); does not depend on distribution of sets
** Ontology tools (SObA)
+
* Applied to various data types
** Community curation
+
* Analysis: 10-fold cross validation
** Author first pass
+
** Randomly select 10% pos and neg (without replacement) and repeat until all papers sampled
 +
* F-value changes over different p/n values; G-value does not (essentially flat)
 +
* Area Under the Curve (AUC): probability that a random positive scores higher than a random negative
 +
* AUC values for many WB data types range from the upper 80s into the 90s (percent)
 +
* Ranjana: How many papers for a good training set? Michael: we don't know yet
 +
* Can't reproduce old training sets (for old SVM); provide Michael better training sets if you want improved SVM
 +
* If the SVM is still not good enough, Michael will work on deep neural networks (TensorFlow)
 +
* Michael can provide training sets he has used recently
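
A minimal sketch of the cross-validation sampling described above, assuming papers are identified by plain IDs and are already split into positive and negative lists. The function name `stratified_folds` and the toy paper IDs are illustrative only and are not taken from Michael's pipeline.

```python
import random


def stratified_folds(positives, negatives, k=10, seed=0):
    """Split papers into k (train, test) pairs.

    Each test set takes roughly 1/k of the positives and 1/k of the
    negatives, sampled without replacement, so every paper is tested
    exactly once across the k folds.
    """
    rng = random.Random(seed)
    pos = list(positives)
    neg = list(negatives)
    rng.shuffle(pos)
    rng.shuffle(neg)

    folds = []
    for i in range(k):
        # Every k-th paper (offset i) goes into the test set of fold i.
        test = pos[i::k] + neg[i::k]
        train = [p for j, p in enumerate(pos) if j % k != i]
        train += [n for j, n in enumerate(neg) if j % k != i]
        folds.append((train, test))
    return folds


# Toy example: 30 positive and 70 negative paper IDs (made up).
example_pos = [f"pos{i}" for i in range(30)]
example_neg = [f"neg{i}" for i in range(70)]
example_folds = stratified_folds(example_pos, example_neg)
```

Slicing with a stride keeps the positive/negative ratio of each test set close to that of the full set, which is the property the 10% sampling in the notes relies on.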
  
=== 2019 IWM Workshop videos ===
+
=== Clarifying definitions of "defective" and "deficient" for phenotypes ===
* On YouTube, but not public yet
+
* WB phenotype ontology has many "variant/abnormal" terms and distinct subclass terms for "defective/deficient"
* Chris will make them public and send links to Ranjana for the blog post
+
* Have tried to create a logical definition pattern for these terms, but the vagueness of the meaning of "defective" and how it is distinct from "abnormal" has stalled the process
 +
* What do we mean exactly by "defective" and how, specifically, is this distinct from "abnormal"?
 +
* Definitions include meanings or words:
 +
** "Variations in the ability"
 +
** "aberrant"
 +
** "defect"
 +
** "defective"
 +
** "defects"
 +
** "deficiency"
 +
** "deficient"
 +
** "disrupted"
 +
** "impaired"
 +
** "incompetent"
 +
** "ineffective"
 +
** "perturbation that disrupts"
 +
** Failure to execute the characteristic response = abnormal?
 +
** abnormal
 +
** abnormality leading to specific outcomes
 +
** fail to exhibit the same taxis behavior = abnormal?
 +
** failure
 +
** failure OR delayed
 +
** failure, slower OR late
 +
** failure/abnormal
 +
** reduced
 +
** slower
  
 +
=== Citace upload ===
 +
** Tuesday, Sep 24th
  
== August 15, 2019 ==
+
=== Strain to ID mapping ===
 +
* Waiting on Hinxton to send strain ID mapping file?
 +
* Hopefully we can all get that well before the upload deadline
 +
* Will do global replacement at time of citace upload (at least for now)
  
=== GO Alliance slim terms ===
+
=== New name server ===
* We need to update our GO slim terms for WB GO ribbons to be in sync with Alliance
+
* When will this officially go live?
* May need to watch out for terms that don't apply to worms
+
* Will we now be able to request strain IDs through the server? Yes
* Raymond gets slim terms into Solr from OBO release file; Sibyl collecting from different source; should make the same (pull from WB FTP site?)
 
  
=== Phenotype ontology patternization ===
+
=== SObA Graphs ===
* Now have 676 terms patternized (27% of 2,506 terms total)
+
* New graphs now live on site (Expression, Gene Ontology, Human Diseases, Phenotypes)
* Have reviewed the class hierarchy, collecting list of unexpected class subsumptions
+
* A lot of whitespace padding above and below the graph; maybe trim? Trimming vertically would ultimately limit the view pane when a user wants to zoom in, so we should leave it as is for now
* Issues to address collected here: https://docs.google.com/document/d/1IWtQbEQ-elM-U5SQyU4VfIH3vdJp6taVMGViIjGyVks/edit?usp=sharing
+
* Diff tool: Raymond and Juancarlos created a prototype diff tool (for comparing two genes, for example)
 
+
** Paul: compared two genes that should be very similar, but there are a lot of differences; may reflect annotation coverage rather than biology
 
 
== August 22, 2019 ==
 
 
 
=== Obsolete ontology terms in Postgres ===
 
* There are currently 172 GO annotations in the GO OA, 94 in Expr OA, and 54 in Pic OA referring to obsolete GO terms
 
** https://docs.google.com/spreadsheets/d/14iG3-s0GrZ3_W87iOjD6tZiQiUJklRRjgs6ARFWi9E4/edit?usp=sharing
 
* We would like a mechanism for detecting and alerting curators to obsolete ontology terms in the OA/Postgres
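
One possible shape for such a check, sketched under the assumption that term status is read from an OBO release file (where obsolete terms carry an `is_obsolete: true` tag) and that annotations can be exported as (row ID, term ID) pairs. The function names, the sample term IDs, and the export format are hypothetical; the actual OA/Postgres tables are not described in these notes.

```python
def obsolete_ids(obo_text):
    """Return the set of term IDs marked 'is_obsolete: true' in OBO text."""
    obsolete = set()
    current_id = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            current_id = None
        elif in_term and line.startswith("id: "):
            current_id = line[len("id: "):].strip()
        elif in_term and line.startswith("is_obsolete: true") and current_id:
            obsolete.add(current_id)
    return obsolete


def flag_annotations(annotations, obsolete):
    """Return the (row_id, term_id) pairs that point at obsolete terms."""
    return [(row_id, term) for row_id, term in annotations if term in obsolete]


# Made-up OBO fragment and annotation rows, for illustration only.
SAMPLE_OBO = """\
[Term]
id: GO:9999991
name: placeholder live term

[Term]
id: GO:9999992
name: placeholder retired term
is_obsolete: true

[Typedef]
id: part_of
"""

sample_rows = [("row1", "GO:9999991"), ("row2", "GO:9999992")]
flagged = flag_annotations(sample_rows, obsolete_ids(SAMPLE_OBO))
```

The flagged pairs could then be turned into a curator alert; how that alert is delivered is left open here.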
 
 
 
=== Community phenotype requests ===
 
* Sent out new round of phenotype requests on August 20, 21, and 22 (today) 2019
 
* 2,627 emails/papers requested
 
* 112 emails bounced; 4 resent to new addresses
 
* 205 Phenotype OA community annotations; 55 RNAi OA annotations, from 47 papers, by 42 distinct community curators (so far)
 
* Also, 3 worksheets submitted, for 4 papers
 
* 35 papers flagged as not having phenotypes
 
* 86 papers with responses (3% response; still early)
 
* Need to coordinate with the AFP request pipeline
 
 
 
=== GO slim terms ===
 
* SObA highlights Alliance slim terms, but these don't correspond to the ribbon
 
* Want to use same slim terms used for ribbons
 
* Add slim terms into ACEDB? One option
 
* Ribbon order of slim terms is an issue
 
* Decided not to store ribbon info (slim terms and term order) in ACEDB, but rather in web code
 
* Will have to manually synchronize with Alliance if Alliance changes its ribbon
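
A small sketch of a check that could support that manual synchronization, assuming both the ribbon stored in web code and the Alliance ribbon can be written as ordered lists of term IDs. The function name `compare_slims` and the placeholder term names are hypothetical; how the Alliance list would be obtained is not covered in these notes.

```python
def compare_slims(local, remote):
    """Compare two ordered slim-term lists.

    Reports terms missing locally, terms only present locally, and
    whether the terms shared by both lists appear in a different order.
    """
    local_set = set(local)
    remote_set = set(remote)
    shared_local = [t for t in local if t in remote_set]
    shared_remote = [t for t in remote if t in local_set]
    return {
        "missing": [t for t in remote if t not in local_set],
        "extra": [t for t in local if t not in remote_set],
        "order_differs": shared_local != shared_remote,
    }


# Placeholder term names, not real slim terms.
web_code_ribbon = ["term_a", "term_b", "term_c"]
alliance_ribbon = ["term_b", "term_a", "term_d"]
report = compare_slims(web_code_ribbon, alliance_ribbon)
```

Checking order separately from membership matters here because the notes flag ribbon order as an issue in its own right.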
 
 
 
== August 29, 2019 ==
 
 
 
=== Caltech use of Hinxton name service ===
 
* Name service:
 
** https://names.wormbase.org/gene
 
* API documentation:
 
** https://names.wormbase.org/api-docs/index.html#!
 
* Sibyl is wondering:
 
** "it would be nice to know when and how curators use the name service. And does any script managed by Caltech write data into the name services?"
 
** "what needs to be done to adopt the new name service. For example, what concern do you or the curators have about adopting the new name service, what kind of tests will allow the curators to trust that the new name service is working correctly, what changes to the new name service is needed for Caltech to be able to use it."
 
* Daniela, Chris, Karen are the only Caltech curators who use it.
 
* Daniela's workflow is:
 
** Look up variation through OA
 
** Look up variation through name service (what's the URL for that?)
 
** Create object through name service, then enter information
 
** Create temporary entry for OA through Caltech postgres services
 
* Juancarlos has emailed Sibyl + Matt + Caltech with what we know so far.  Karen and Chris, please reply if you have other name service uses beyond what Daniela does.
 
* Name service Google/WormBase login doesn't work right now for Caltech curators; Matt has been asked to add people.
 
 
 
=== Anatomy term names ===
 
* While going through the concise description text, Daniela noted some terms that could benefit from a more descriptive name.
 
** e.g., Pn.p hermaphrodite -> Pn.p hermaphrodite vulval precursor cell; for instance, P3.p hermaphrodite (WBbt:0008112) -> P3.p hermaphrodite vulval precursor cell
 
List of other potential terms:
 
<pre>
 
AB -> embryonic founder cell AB
 
EMS -> embryonic founder cell EMS -> should the definition of EMS be changed from ‘Embryonic cell’ to ‘Embryonic founder cell’? See WormAtlas founder cell definition.
 
C -> embryonic founder cell C
 
D -> embryonic founder cell D
 
E -> embryonic founder cell E
 
MS -> embryonic founder cell MS
 
Psub1 -> embryonic founder cell Psub1 (or simply embryonic founder cell P1)
 
Psub2 -> embryonic founder cell Psub2
 
Psub3 -> embryonic founder cell Psub3
 
Psub4 -> germline founder cell Psub4
 
G1 -> Postembryonic blast cell G1
 
G2 -> Postembryonic blast cell G2
 
</pre>
 
 
 
* Decision: we will keep the names as they are.
 
* For the term 'Pn.p hermaphrodite', Raymond will change the name to 'Pn.p in hermaphrodite'
 

Revision as of 21:07, 12 September 2019

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings


GoToMeeting link: https://www.gotomeet.me/wormbase1


2019 Meetings

January

February

March

April

May

June

July

August


September 12, 2019

Update on SVM pipeline

  • New SVM pipeline: more analysis and more parameter tuning
  • Avoiding precision (and F-value) as a measure, since both depend on the ratio of positives to negatives in the test set
  • In the example shown, a "dumb" machine starts out with precision above 0.6
  • G-value (Michael's invention); does not depend on distribution of sets
  • Applied to various data types
  • Analysis: 10-fold cross validation
    • Randomly select 10% pos and neg (without replacement) and repeat until all papers sampled
  • F-value changes over different p/n values; G-value does not (essentially flat)
  • Area Under the Curve (AUC): probability that a random positive scores higher than a random negative
  • AUC values for many WB data types range from the upper 80s into the 90s (percent)
  • Ranjana: How many papers for a good training set? Michael: we don't know yet
  • Can't reproduce old training sets (for old SVM); provide Michael better training sets if you want improved SVM
  • If the SVM is still not good enough, Michael will work on deep neural networks (TensorFlow)
  • Michael can provide training sets he has used recently
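
The dependence of precision on the positive/negative ratio, and the pairwise reading of AUC given above, can be illustrated with a short sketch. The scores below are made up, and the G-value is not reproduced here because these notes do not define it.

```python
def precision(pos_scores, neg_scores, threshold):
    """Precision of the rule 'score >= threshold means positive'."""
    tp = sum(s >= threshold for s in pos_scores)
    fp = sum(s >= threshold for s in neg_scores)
    return tp / (tp + fp) if (tp + fp) else 0.0


def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive
    scores higher; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up classifier scores for 3 positive and 2 negative papers.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3]

# A "dumb" machine that flags every paper (threshold 0.0) gets a
# precision equal to the share of positives in the test set.
dumb_balanced = precision(pos, neg, 0.0)      # 3 of 5 flagged are positive
dumb_skewed = precision(pos, neg * 10, 0.0)   # 3 of 23 flagged are positive

# AUC compares pairs, so repeating the negatives leaves it unchanged.
auc_balanced = auc(pos, neg)
auc_skewed = auc(pos, neg * 10)
```

With the same scores, precision drops from 3/5 to 3/23 when the negatives are repeated tenfold, while AUC stays at 5/6 in both cases.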

Clarifying definitions of "defective" and "deficient" for phenotypes

  • WB phenotype ontology has many "variant/abnormal" terms and distinct subclass terms for "defective/deficient"
  • Have tried to create a logical definition pattern for these terms, but the vagueness of the meaning of "defective" and how it is distinct from "abnormal" has stalled the process
  • What do we mean exactly by "defective" and how, specifically, is this distinct from "abnormal"?
  • Definitions include meanings or words:
    • "Variations in the ability"
    • "aberrant"
    • "defect"
    • "defective"
    • "defects"
    • "deficiency"
    • "deficient"
    • "disrupted"
    • "impaired"
    • "incompetent"
    • "ineffective"
    • "perturbation that disrupts"
    • Failure to execute the characteristic response = abnormal?
    • abnormal
    • abnormality leading to specific outcomes
    • fail to exhibit the same taxis behavior = abnormal?
    • failure
    • failure OR delayed
    • failure, slower OR late
    • failure/abnormal
    • reduced
    • slower

Citace upload

    • Tuesday, Sep 24th

Strain to ID mapping

  • Waiting on Hinxton to send strain ID mapping file?
  • Hopefully we can all get that well before the upload deadline
  • Will do global replacement at time of citace upload (at least for now)
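
A hedged sketch of what the global replacement might look like, assuming the mapping arrives as strain name to strain ID pairs. The mapping file has not been received yet, so its format, the function name `replace_strains`, and the IDs below are placeholders.

```python
import re


def replace_strains(text, mapping):
    """Replace whole strain names in text with their mapped IDs."""
    if not mapping:
        return text
    # Try longer names first so a short name never shadows a longer one.
    names = sorted(mapping, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # Require that the name is not embedded in a longer word or
    # hyphenated token on either side.
    pattern = re.compile(r"(?<![\w-])(" + alternatives + r")(?![\w-])")
    return pattern.sub(lambda m: mapping[m.group(1)], text)


# Placeholder IDs; the real values would come from the Hinxton file.
example_mapping = {"N2": "WBStrain00000001", "CB1370": "WBStrain00000002"}
example_text = 'Strain "N2" and "CB1370"; see N2x and N2-like'
replaced = replace_strains(example_text, example_mapping)
```

Matching whole names only is a deliberate choice in this sketch: it avoids rewriting longer tokens that merely contain a strain name.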

New name server

  • When will this officially go live?
  • Will we now be able to request strain IDs through the server? Yes

SObA Graphs

  • New graphs now live on site (Expression, Gene Ontology, Human Diseases, Phenotypes)
  • A lot of whitespace padding above and below the graph; maybe trim? Trimming vertically would ultimately limit the view pane when a user wants to zoom in, so we should leave it as is for now
  • Diff tool: Raymond and Juancarlos created a prototype diff tool (for comparing two genes, for example)
    • Paul: compared two genes that should be very similar, but there are a lot of differences; may reflect annotation coverage rather than biology