(304 intermediate revisions by 11 users not shown) |
Line 18: |
Line 18: |
| | | |
| [[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]] | | [[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]] |
| + | |
| + | [[WormBase-Caltech_Weekly_Calls_2019|2019 Meetings]] |
| | | |
| | | |
| GoToMeeting link: https://www.gotomeet.me/wormbase1 | | GoToMeeting link: https://www.gotomeet.me/wormbase1 |
| | | |
| + | = 2020 Meetings = |
| | | |
− | = 2019 Meetings =
| + | [[WormBase-Caltech_Weekly_Calls_January_2020|January]] |
− | | |
− | [[WormBase-Caltech_Weekly_Calls_January_2019|January]] | |
− | | |
− | [[WormBase-Caltech_Weekly_Calls_February_2019|February]]
| |
− | | |
− | [[WormBase-Caltech_Weekly_Calls_March_2019|March]]
| |
− | | |
− | [[WormBase-Caltech_Weekly_Calls_April_2019|April]]
| |
− | | |
− | [[WormBase-Caltech_Weekly_Calls_May_2019|May]]
| |
− | | |
− | [[WormBase-Caltech_Weekly_Calls_June_2019|June]]
| |
− | | |
− | [[WormBase-Caltech_Weekly_Calls_July_2019|July]]
| |
− | | |
− | [[WormBase-Caltech_Weekly_Calls_August_2019|August]]
| |
− | | |
− | | |
− | == September 12, 2019 ==
| |
− | | |
− | === Update on SVM pipeline ===
| |
− | * New SVM pipeline: more analysis and more parameter tuning
| |
− | * Avoiding precision (and F-value) as a measure, since both depend on the ratio of positives to negatives in the test set
| |
− | * In the example shown, a "dumb" machine starts out with precision above 0.6
| |
− | * G-value (Michael's invention); does not depend on distribution of sets
| |
− | * Applied to various data types
| |
− | * Analysis: 10-fold cross validation
| |
− | ** Randomly select 10% pos and neg (without replacement) and repeat until all papers sampled
| |
− | * F-value changes over different p/n values; G-value does not (essentially flat)
| |
− | * Area Under the Curve (AUC): probability that a random positive scores higher than a random negative
| |
− | * AUC values for many WB data types are in the upper 80s to 90s (percent)
| |
− | * Ranjana: How many papers for a good training set? Michael: we don't know yet
| |
− | * Can't reproduce old training sets (for old SVM); provide Michael better training sets if you want improved SVM
| |
− | * If SVM is still not good enough, Michael will work on deep neural networks (TensorFlow)
| |
− | * Michael can provide training sets he has used recently
| |
− | | |
− | === Clarifying definitions of "defective" and "deficient" for phenotypes ===
| |
− | * WB phenotype ontology has many "variant/abnormal" terms and distinct subclass terms for "defective/deficient"
| |
− | * Have tried to create a logical definition pattern for these terms, but the vagueness of the meaning of "defective" and how it is distinct from "abnormal" has stalled the process
| |
− | * What do we mean exactly by "defective" and how, specifically, is this distinct from "abnormal"?
| |
− | * Definitions include meanings or words:
| |
− | ** "Variations in the ability"
| |
− | ** "aberrant"
| |
− | ** "defect"
| |
− | ** "defective"
| |
− | ** "defects"
| |
− | ** "deficiency"
| |
− | ** "deficient"
| |
− | ** "disrupted"
| |
− | ** "impaired"
| |
− | ** "incompetent"
| |
− | ** "ineffective"
| |
− | ** "perturbation that disrupts"
| |
− | ** Failure to execute the characteristic response = abnormal?
| |
− | ** abnormal
| |
− | ** abnormality leading to specific outcomes
| |
− | ** fail to exhibit the same taxis behavior = abnormal?
| |
− | ** failure
| |
− | ** failure OR delayed
| |
− | ** failure, slower OR late
| |
− | ** failure/abnormal
| |
− | ** reduced
| |
− | ** slower
| |
− | | |
− | === Citace upload ===
| |
− | ** Tuesday, Sep 24th
| |
− | | |
− | === Strain to ID mapping ===
| |
− | * Waiting on Hinxton to send strain ID mapping file?
| |
− | * Hopefully we can all get that well before the upload deadline
| |
− | * Will do global replacement at time of citace upload (at least for now)
| |
− | | |
− | === New name server ===
| |
− | * When will this officially go live?
| |
− | * Will we now be able to request strain IDs through the server? Yes
| |
− | | |
− | === SObA Graphs ===
| |
− | * New graphs now live on site (Expression, Gene Ontology, Human Diseases, Phenotypes)
| |
− | * A lot of whitespace padding above and below the graph; maybe trim? Trimming vertically would ultimately limit the view pane when a user wants to zoom in, so we should leave it as is for now
| |
− | * Diff tool: Raymond and Juancarlos created a prototype diff tool (for comparing two genes, for example)
| |
− | ** Paul: compared two genes that should be very similar, but there are a lot of differences; may reflect annotation coverage rather than biology
| |
− | | |
− | | |
− | == September 19, 2019 ==
| |
− | | |
− | === Strains ===
| |
− | * Need to wait for new strain IDs from Hinxton before running dumping scripts
| |
− | * Don't edit multi-ontology strain fields in OA for now!
| |
− | * Juancarlos will map free text and ontology-name strain entries to strain IDs once we have the complete mapping file
| |
− | * "Requested strain" field in Disease OA is not dumped, so no need to worry about it right now
| |
| | | |
− | === Alliance literature curation ===
| + | [[WormBase-Caltech_Weekly_Calls_February_2020|February]] |
− | * Working group will be formed soon
| |
− | * Will work out general common pipelines for literature curation
| |
| | | |
− | === SObA Graph relations ===
| + | [[WormBase-Caltech_Weekly_Calls_March_2020|March]] |
− | * Currently only integrating over "is a", "part of" and "regulates"
| |
− | * Maybe we could provide users an option to specify which relations to include, or maybe just exclude "regulates"
| |
| | | |
− | === Author First Pass ===
| + | [[WormBase-Caltech_Weekly_Calls_April_2020|April]] |
− | * Putting together paper for AFP
| |
− | * Reviewing all user input for paper
| |
− | * Asking individual curators to check input
| |
| | | |
| + | [[WormBase-Caltech_Weekly_Calls_May_2020|May]] |
| | | |
− | == September 26, 2019 ==
| |
| | | |
− | === Data mining === | + | == June 4, 2020 == |
− | * Someone in Paul's lab asking to retrieve list of C. elegans orthologs from a list of human genes
| |
− | * Could we build a (simple) Alliance tool to do this?
| |
− | * Could SimpleMine do this? Could we build a SimpleMine-like tool for Alliance?
| |
| | | |
− | === Strains === | + | === Citace (tentative) upload === |
− | * Paul D generated WBStrains for the missing TransgeneOme objects | + | * CIT curators upload to citace on Tuesday, July 7th, 10am Pacific |
− | * Working on a pipeline to identify new TransgeneOme strains at each upload
| + | * Citace upload to Hinxton on Friday, July 10th |
− | * One TransgeneOme object had 2 strains. Possible solutions: dump 2 expression objects that differ only in the Strain or remove the UNIQUE tag in the data model
| |
− | ** Probably best to keep UNIQUE tag
| |
− | * Raymond: concerned about automatically generating strains based on imports from the group
| |
− | * Many odd strain names are coming from the TransgeneOme group; maybe we ought to have more discussions about generating official (following nomenclature standards) strain names from their imports | |
− | * Quarantine strains on initial import; review and accept if they pass standards
| |
| | | |
− | === Community phenotype requests August 2019 === | + | === Caltech reopening === |
− | * Sent out new round of phenotype requests on August 20, 21, and 22, 2019 | + | * Paul looking to get plan approved |
− | * 2,626 emails/papers requested | + | * People who want to come to campus need to watch the training video |
− | * 114 emails bounced; 5 resent to new addresses | + | * Masks available in Paul's lab |
− | * 460 Phenotype OA community annotations; 181 RNAi OA annotations (641 annotations total) | + | * Can have maximum of 3 people in WormBase rooms at a time; probably best to only allow one person per WB room |
− | * From 94 papers (83 for Phenotype OA; 33 for RNAi; 22 for both) | + | ** Could possibly have 2 people in big room (Church 64) as long as they stay at least 10 feet apart |
− | * By 81 distinct community curators (70 for Phenotype OA; 32 for RNAi OA; 21 for both) | + | * Need to coordinate, maybe make a Google calendar to do so (also Slack) |
− | * 50 papers flagged as not having phenotypes (40 papers DO have phenotypes; 10 marked as negative; 80% failure rate!) | + | * Before and after you go to campus, you need to take your temperature and assess your symptoms (if any) and submit info on form |
− | ** Email states: "If there are no nematode phenotypes in this paper click the following link :" | + | * Also, need to submit who you were in contact with for contact tracing |
− | ** Maybe people are confused, or want to blow off the request | + | * The same form is used all week; hold on to it until asked to submit it |
− | ** Maybe we can programmatically generate short URLs for the link? May be difficult
| + | * If someone goes into the office, they could print several forms for people to pick up in WB offices |
− | ** Provide a link to correct mistakes on confirmation page | |
− | * 4 papers flagged for phenotypes (only 2 had curatable phenotypes; 1 had honey-induced phenotypes)
| |
− | * 115 papers with responses (5% response); 24 papers with input that were not main focus of request
| |
− | * Can we provide an opt-out link?
| |
| | | |
− | === Comparison SObA === | + | === Nameserver === |
− | * Actually quite complicated; may require more consideration | + | * Nameserver was down |
| + | * CIT curators would still like to have a single form to interact with |
| + | * Is it possible to create objects at Caltech and let a cronjob assign IDs via the nameserver? May not be a good idea |
| + | * Still putting genotype and all info for a strain in the reason/why field in the nameserver |
| + | * We plan to eventually connect strains to genotypes, but need model changes and curation effort to sort out |
| + | * Hinxton is pulling in CGC strains; how often? |
| + | * Caltech could possibly get a block of IDs |
| | | |
− | === SObA graph and Ontology Browser for papers === | + | === Alliance SimpleMine === |
− | * May be able to modify/hack existing tools for genes and apply to papers | + | * Any updates? 3.1 feature freeze is tomorrow |
− | * Paper-term matching powered by Textpresso | + | * Pending PI decision; Paul S. will bring it up tomorrow on the Alliance PI call |