WormBase-Caltech Weekly Calls June 2015
- 1 June 2015
- 1.1 June 4, 2015
- 1.2 June 11, 2015
- 1.2.1 modSeek concise descriptions for C. elegans genes
- 1.2.2 Expression Cluster data
- 1.2.3 Global correlations of genes across datasets
- 1.2.4 Allele-Phenotype Form / Community Curation
- 1.2.5 Term Enrichment Tool for WormBase and WormBase Parasite
- 1.2.6 Tracking WBPerson PIs
- 1.2.7 Keywords for papers
- 1.3 June 18, 2015
June 4, 2015
- Added link to Allele molecular lesion form (which currently also has phenotype submission fields)
- Added remark that optional fields apply to "ALL OBSERVED PHENOTYPES" for an allele
- Added link in WBPerson term info to Person page with publications showing
- Do we want individual e-mails sent out for each submission, or, say, one daily summary e-mail?
- The concern is that someone may abuse the form and send a bunch of junk to some random person's e-mail address
- Consensus: will stick with individual e-mails for now
- Should confirmation e-mails contain an "I didn't submit this data" link to nullify abusive submissions? Would people use it?
- Consensus: Yes
- Do we want to set up a mechanism to block abusive IP addresses?
- Consensus: Not yet
- Should we pre-populate the e-mail field once a user selects a name/WBPerson?
- Not sure we came to an agreement; will go ahead with trying this
- It seems that many users want to enter data quickly and TAB away
- Do we want to allow exact matches to register even if a term is not selected from the autocomplete dropdown?
- Consensus: Not going to worry about it now
- Do we want a general free-text remark field for the entire submission? For individual optional fields?
- Consensus: Yes
- Will want to add a remark, and possibly a hyperlink, to/about a bulk data submission form/pipeline/spreadsheet
- What would be the best mechanism for bulk data submission?
- Not clear, can stick with Google Docs for now
Allele function ontology (ky)
- I could not find a ready-built ontology to adopt. Few genetic model organism databases have one; of those, FB's is the most useful. Other MODs use controlled vocabularies (CVs) that are either not comprehensive enough (covering only the basics: null, loss of function, etc.) or molecular-based (VariO, SO)
- Made an ontology using Schedl-defined terms and the FB hierarchy
- Matt Brush of the Monarch Initiative is building the GENO ontology (https://github.com/monarch-initiative/GENO-ontology) as a cross-species, genetics-based controlled vocabulary. GENO describes genetic variation specified in genotypes, to support genotype-to-phenotype (G2P) data aggregation and analysis across diverse research communities and sources. I'll be submitting our terms to them and modifying our terms to keep in sync with GENO.
- I've invited FB curators to work with us to make our terms species-nonspecific.
- Requested model change for WS250 with new terms.
June 11, 2015
modSeek concise descriptions for C. elegans genes
- Proteins with no known domains, points of interest; what is the orthology, etc.?
- Descriptions available for ~70% of the C. elegans genome
- Can use expression data
- Some cases where nothing is known; we should say that; community can speak up if they know something
- Not necessarily what we want to do on the WormBase site
- Any genes/proteins with orthology should have a description
Expression Cluster data
- Typical for a gene to have ~70 expression clusters
- What are the important types/associations: anatomy, life stage
- Expression clusters are not annotated to processes, GO Terms
- How many clusters are related to processes?
- Many involved in gene regulation, stress, aging, immune response
- Difference between "upregulated by process" and "involved in process"
- "Differentially expressed in response to ..."
Global correlations of genes across datasets
- We can use many of the datasets (interactions, similar GO terms, co-expression, processes/topics, etc.) to draw interactions between genes
- "Correlated interactions", i.e. guilt-by-association
- We could use what SGD uses
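The guilt-by-association idea above can be sketched roughly as follows. This is only an illustrative sketch, not a method agreed on in the call: the dataset names, the placeholder gene identifiers, the `correlated_pairs` function, and the `min_support` threshold are all assumptions. It counts how many independent datasets link each pair of genes and keeps the pairs supported by at least a minimum number of them.

```python
from collections import Counter
from itertools import combinations


def correlated_pairs(datasets, min_support=2):
    """Return gene pairs linked by at least `min_support` datasets.

    `datasets` maps a dataset name to a list of gene groups; every pair of
    genes that shares a group counts as linked by that dataset (once).
    """
    support = Counter()
    for groups in datasets.values():
        linked = set()
        for genes in groups:
            # Sort so that (a, b) and (b, a) are treated as the same pair.
            linked.update(combinations(sorted(set(genes)), 2))
        support.update(linked)  # one vote per dataset per pair
    return {pair: n for pair, n in support.items() if n >= min_support}


# Hypothetical input: placeholder gene names, not real annotations.
datasets = {
    "co_expression": [["gene-a", "gene-b", "gene-c"]],
    "shared_go_term": [["gene-a", "gene-b"], ["gene-c", "gene-d"]],
    "physical_interaction": [["gene-a", "gene-b"]],
}
pairs = correlated_pairs(datasets)
print(pairs)
```

Counting each dataset at most once per pair keeps a single large dataset from dominating the score; weighting datasets differently would be a natural refinement.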
Allele-Phenotype Form / Community Curation
- We would like to encourage participation in the community curation as much as possible
- What kind of outreach would be best?
- We can send personal (or automated) e-mails to authors of papers for which we still need, for example, allele-phenotype annotations
- We want to make it clear that people will get attribution for their data contributions
- We need to consider how to best do this: data model, #Evidence hash, dumping from the OA, display contributions in a "Curation Contributions" widget on the WBPerson page, etc.
- Difficult to look up PMIDs; we can make adjustments to the form:
- Add a notification next to the "Your Name" field to link to list of publications
- Instead of linking to WormBase WBPerson page, we can link to a custom HTML table with all publications for that person; table could include PMID, WBPaper ID, title, authors list, etc.
Term Enrichment Tool for WormBase and WormBase Parasite
- Kimberly and Jane Lomax discussed
- Want to use web services for WB users to access the PANTHER tool via the WormBase site
- We could set this up tomorrow for C. elegans, but other species would need to be added to the PANTHER database
- Brugia and Onchocerca can be added; Strongyloides needs to be added to the reference protein data set
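For context on what a term enrichment tool computes, here is a minimal sketch of a generic over-representation test (a one-sided hypergeometric test). It is not the PANTHER implementation and does not call the PANTHER web services; the function name, parameter names, and the numbers in the usage lines are assumptions for illustration only.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: number of genes in the reference set
    annotated:  reference genes annotated to the term
    sample:     number of genes in the user's list
    overlap:    genes in the user's list annotated to the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 reference genes, 3 annotated, and a 3-gene list that
# contains all 3 annotated genes.
p = enrichment_p_value(population=10, annotated=3, sample=3, overlap=3)
print(p)
```

A real tool would also correct for testing many terms at once, which this sketch leaves out.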
Tracking WBPerson PIs
- Transferring lab designations from retiring PIs to new lab heads, contact persons
Keywords for papers
- Have keywords for older papers, not newer papers
- MeSH terms, added by CGC by hand?
- Automatic updates? We can use XML to automate the keyword annotations
- GO Terms annotated to papers, but only by cross-reference from other sites
- Wen would like the keywords for large datasets
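One way the XML-based automation mentioned above might look is sketched below. It assumes paper records are available as PubMed-style XML, in which MeSH terms appear as `DescriptorName` elements and author keywords as `Keyword` elements. The embedded sample record, its placeholder PMID, and the `extract_keywords` function are hypothetical and stand in for whatever feed would actually be used.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record in PubMed-style XML (placeholder PMID).
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>00000000</PMID>
      <MeshHeadingList>
        <MeshHeading><DescriptorName>Caenorhabditis elegans</DescriptorName></MeshHeading>
        <MeshHeading><DescriptorName>Aging</DescriptorName></MeshHeading>
      </MeshHeadingList>
      <KeywordList><Keyword>stress response</Keyword></KeywordList>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def extract_keywords(xml_text):
    """Map each PMID to its MeSH descriptors followed by its author keywords."""
    root = ET.fromstring(xml_text)
    result = {}
    for citation in root.iter("MedlineCitation"):
        pmid = citation.findtext("PMID")
        terms = [e.text for e in citation.iter("DescriptorName") if e.text]
        terms += [e.text for e in citation.iter("Keyword") if e.text]
        result[pmid] = terms
    return result


keywords = extract_keywords(SAMPLE)
print(keywords)
```

The resulting PMID-to-terms mapping could then be joined to WBPaper IDs before loading; that step depends on the local paper tables and is not shown.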
June 18, 2015
Itai Yanai dataset
- Spatio-temporal expression (whole embryo up to larval stage and cultured blastomeres) graphs for all C. elegans genes
- Daniela in contact to coordinate reception of data
- Uses the CEL-seq technique (RNA-seq on single cells)
Tutorial video for Allele-phenotype form
- Chris working on an efficient pipeline for making tutorial videos
- Working out narration first, video next
- Best if we can have a pipeline that requires minimal post-processing
- How to ask WormBase users about data curation priorities
- Ranking data types, in order, or asking generally about importance of each data type
- Curators should take a look at the survey and provide feedback on questions relevant to their respective data types
- Feedback should be sent to Todd
- Depth of curation per data type?
- Ask about whether users care at all about each data type?