Difference between revisions of "WormBase-Caltech Weekly Calls"
From WormBaseWiki
Revision as of 19:46, 22 February 2018
Contents
- 1 Previous Years
- 2 2018 Meetings
- 2.1 February 1, 2018
- 2.2 February 8, 2018
- 2.2.1 Release schedule
- 2.2.2 New York Worm Meeting
- 2.2.3 GO curation
- 2.2.4 Phenotype curation
- 2.2.5 Expression curation
- 2.2.6 Gene regulation curation
- 2.2.7 Physical interaction curation
- 2.2.8 Disease curation
- 2.2.9 Expression cluster curation
- 2.2.10 April and May Worm Meetings
- 2.2.11 WormBook
- 2.2.12 Papers
- 2.2.13 AGR
- 2.3 February 15, 2018
- 2.4 February 22, 2018
Previous Years
GoToMeeting link: https://www.gotomeet.me/wormbase1
2018 Meetings
February 1, 2018
Automated gene descriptions - orthology
- Some genes have human orthology mentioned in automated descriptions, even though the orthology has not been called in DIOPT
- WormBase uses EnsemblCompara and other methods (not aggregate method like DIOPT)
- Orthology synchrony is a challenge; WormBase and FlyBase may need to pay special attention to orthology calls and discrepancies
- DIOPT is purely automated, does not consider other information about orthology evidence
- We should be clear about how the orthology calls are made
Next upload
- Exact date unclear
- Probably end of March
SimpleMine issue
- Redundant genes in input list are merged
- Should SimpleMine provide an option to keep redundancies?
- Give option up front? Provide submission step to point out redundancies? Ask for choice?
- We can default to show row-by-row correspondence, and display the number of redundant entries
- Conclusion: Make an option for users to indicate if they want row-by-row correspondence or a merged list
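The two behaviors discussed above can be sketched as follows. This is only an illustration and not SimpleMine's actual code: the function name `prepare_gene_list` and its `row_by_row` flag are hypothetical, and the sketch assumes that "redundant" means an exact repeat of an input string (no name normalization or synonym resolution).

```python
from collections import Counter


def prepare_gene_list(genes, row_by_row=True):
    """Return (output_list, n_redundant) for a user-supplied gene list.

    Hypothetical helper, not part of SimpleMine.
    row_by_row=True  keeps every input row, so output rows line up with input rows.
    row_by_row=False merges repeated entries, keeping first-seen order.
    n_redundant counts the extra occurrences beyond the first of each entry.
    """
    counts = Counter(genes)
    n_redundant = sum(count - 1 for count in counts.values())
    if row_by_row:
        output = list(genes)
    else:
        # dict.fromkeys drops repeats while preserving insertion order
        output = list(dict.fromkeys(genes))
    return output, n_redundant


if __name__ == "__main__":
    query = ["lin-12", "unc-54", "lin-12", "daf-16", "unc-54", "lin-12"]
    rows, redundant = prepare_gene_list(query, row_by_row=True)
    merged, _ = prepare_gene_list(query, row_by_row=False)
    print(len(rows), "rows kept;", redundant, "redundant entries")
    print("merged list:", merged)
```

Reporting the redundancy count in both modes lets the user see that duplicates were present, whichever output they choose.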
Cell type expression
- Waterston paper
- 40,000 random cells, clusters sequenced individually to a depth of 20,000 reads; ~1000 genes per cell; cluster data; make judgement call as to what cell types they likely are
- For now, we can do a simple annotation: significantly expressed genes for each cell type
- Supplemental table S5 for neurons
- Maybe just ignore the hybrid calls like AQM/PVM, etc.
- It may be good to isolate the single cell data from other expression data
- We should annotate/capture the expression clusters
- Would be good to be able to do enrichment analysis on the clusters; compare data sets
- Data has not been placed in SPELL yet; Gary considered the data a work in progress
- We can communicate with Waterston group; are they collecting more data?
- Wen will take another look at the data
- Gary W. concerned about the reported/assumed/inferred identity of the cells in the paper
- Probably cannot curate to individual cells, but we can annotate to a higher level term
- We want to annotate and display expression enrichment as well as presence/absence calls
February 8, 2018
Release schedule
- Wen will ask Hinxton to update the published release schedule (for next data upload)
New York Worm Meeting
- Wen and Kimberly will present a WormBase tutorial on March 24
- Wen communicated to Oliver Hobert; suggested topics:
- Multi-gene (batch) search tools
- How does literature info get into WormBase? Curation process?
- Should we discuss completeness?
GO curation
- New simple input form for Noctua, being developed at USC
- Not very much GO curation happening at WB right now
- Protein-2-GO pipeline
- Do we have a good Phenotype-2-GO(Process) mapping pipeline? We have our old mappings; not very reliable; would need to spend more time expanding the worm phenotype ontology and GO to improve
- Cellular component curation will come in from WB expression curation
- Don't have pipeline for Interactions-2-GO
- Textpresso Molecular Functions pipeline?
- geneprod and catalyticact data types for molecular function pipeline
- Textpresso can send molecular function annotations to Noctua
- For high-level pathway curation, we should probably read WormBook chapters (or other reviews) and develop pathways (using non-experimental evidence codes)
- We could potentially seed Noctua models from Reactome
- We would like to have complete curation for major pathways for gene enrichment analysis
- Roles of small molecules in Noctua models still being worked out
Phenotype curation
- Chris has had community curation pipeline on back burner while updating Wiki and dealing with AGR, WormMine, etc.
- Will get back to it soon; will resend email requests for newer papers sent over a year ago
Expression curation
- Daniela getting back to expression curation after Micropublication stuff has quieted down
Gene regulation curation
- April came across dataset involving regulation of siRNAs that don't seem to have gene objects in WB
- May need to instantiate genes for these?
Physical interaction curation
- SVM classification; do we flag as negative a paper that has protein interactions but none for C. elegans?
- Can we generate a good SVM that only identifies WB-curatable papers?
Disease curation
- Now curating the specific genetic entities involved in a disease model
- Will also capture environmental conditions, treatments (e.g. ameliorates, exacerbates)
- Curation in line with AGR standards
- Evidence code needed for assertions that an animal is a model of disease, where the assertion is based on background knowledge and experimental evidence together
- Evidence Code Ontology (ECO) is developing a new term to accommodate this
- Disease curators can use new evidence code as well as any existing codes
- Is there a definition of a "disease model"?
- What are the minimal criteria for considering something a disease model?
- WB and FB curators focus on cellular phenotype and relation to the disease
Expression cluster curation
- 27 papers in pipeline
- Will then work on "single-cell" RNAseq
- Wen, Raymond, and David should discuss
April and May Worm Meetings
- Midwest and Colorado meetings
- Wen submitting abstracts
- Wen and Kimberly can write up abstract template for New York meeting and send around to be modified for future meetings
WormBook
- Published last version for legacy site
Papers
- Daniel requested 13 (older) papers from Caltech library through inter-library process
- Received more than half as images; would need optical character recognition (OCR) for Textpresso purposes
- What is the state of the art of OCR now? How good is it? Can we ask Caltech library for the service?
- Are these high priority papers? Need to check to see if worth processing
AGR
- Disease working group setting up a face-to-face meeting
- Variant working group may need a face-to-face meeting as well
- Expression working group working out initial AGR site data display mockups
- Interaction working group; we will want to incorporate miRNA/target interactions (RNA-RNA interactions); will look at miRBase
February 15, 2018
Model changes
- Models freeze March 2nd
- Will need to get model changes proposed and tested by then
Sys admin of Tazendra/Mangolassi
- Raymond will discuss with Juancarlos to centralize
- Need good documentation for forms, tools, etc.
- Will be a push to put all code for tools and forms on GitHub
Tazendra forms, tools bug this week
- There is a dependency on Mangolassi for some tools
- Mangolassi went down and caused problems
- Would be good to decouple the two machines
AGR
- May not get an AGR all-hands face-to-face meeting before the summer
- Working groups can decide to have face-to-face meeting
- People should speak up if they have interest in visiting other MODs/sites; can be arranged
- Consider what grant proposals could come out of such meetings/visits
- Currently no ontology working group, no anatomy working group
- Could establish a preliminary working group; reach out to relevant people
- Anatomy working group issues may come up in expression working group
- Daniela will keep Raymond updated on relevant issues that come up with the expression group
Ontology Browser gene lists
- Chris requesting change to gene list display from WOBr
- https://github.com/WormBase/website/issues/6190
- Should provide WBGene IDs, not just gene public names
- That was the original intent, but using WBGene IDs was, for some reason, causing issues when developing the tool; will need to revisit that issue to get WBGene IDs displayed
February 22, 2018
Acquiring data from Postgres
- Hyperlinking project
- Hyperlinking entities from papers to WormBase
- Identifies genes, alleles, other bioentities
- Genetics (GSA), eLife, PLoS
- Partner with SGD and FlyBase
- Curators check link fidelity
- Now expanding to other Alliance members (rat, mouse, etc. papers)
- What do we do in 2-3 years when we want all links to go to Alliance website?
- Need to be thinking about it
- Maybe discuss with others in 2-3 months (now is very busy)
- Central data repository for all data (MOD) files would be helpful to developers (and users)
- FTP site would be good
- Will expand entities to reagents and link to suppliers
- Could possibly just dump all Postgres data into one place, Karen's developers could write scripts to process that data; cronjob?
- Juancarlos will set up a URL that can be used to access the data; will set it up as a cronjob running every day at 8pm
- Really need data to be as up-to-date as possible
- In Silico page with embedded i-frames (WormBase page inset)
Alliance SAB meeting
- SAB critical of:
- not being unified
- not being organized
- Now everyone has committed
- Concern still exists about autonomy of MODs
- Will each user community still be served effectively by the Alliance?
- Organization is easier when all are committed
- Maybe bring in a professional organizer/project manager (long term)
- New aggressive timeline for progress
- April 23rd meeting; need to give material a week earlier
- Need year and a half plan; each working group will provide details
- Only 2 full time Alliance staff; may need more on project; difficult for individuals to split time/effort
- SAB member: curation involves expert decision making/analysis on issues, not just straight-forward data acquisition
- Maybe we would have better curation consistency if individual curators focused on particular topics; became experts for certain subject matter
- Possibility to have Alliance all-hands call in Fall
Automated gene descriptions
- Difficult to handle genes with high information content; many ontology term annotations
- How do we simplify descriptions? Using higher-level terms, slim terms? Gets tricky