WormBase-Caltech Weekly Calls
Contents
- 1 Previous Years
- 2 2017 Meetings
- 2.1 November 2, 2017
- 2.2 November 9, 2017
- 2.2.1 Citace upload
- 2.2.2 New models file for WS263
- 2.2.3 Data migration call
- 2.2.4 Caltech library
- 2.2.5 Micropublications
- 2.2.6 AGR disease working group update
- 2.2.7 AGR orthology
- 2.2.8 Help Desk question: finding all alcohol dehydrogenase genes
- 2.2.9 Alliance/AGR interactions working group
- 2.2.10 AGR gene expression data working group
- 2.2.11 Site visits
- 2.3 November 16, 2017
- 2.4 November 30, 2017
- 2.5 December 7th, 2017
Previous Years
2017 Meetings
November 2, 2017
Site visits
- Invited to give 25+5 minute talk at Worcester Area Worm Meeting on November 14th
- Had particular interest in micropublications
- 25-30 minutes probably insufficient to present everything that we want
- Also, fairly short notice
- Travel budget?
- Would be good to meet with individual labs and lab members to discuss WormBase
- Will probably decline this offer and wait for a longer talk slot next year
- Will point them to micropublication.org to find out more info
- Maybe Karen/Daniela could do a webinar during the Nov 14 slot
GO annotation for Expression cluster
- Wen asking about some GO annotation details for expression clusters
- Want to annotate a particular data set with a GO term
- Wen will discuss with Kimberly
Tazendra issue
- Had a problem Sunday-Monday
- We should consider moving curation database to a new machine/location and/or creating a backup system
- Move to the cloud? Would reduce maintenance time but would add cost
- Install on different local server?
- What are the requirements? Disk space? Computation?
Marker help desk question
- Someone looking for promoter sequences of pan-neuronal genes; neuronal marker
- Can go to "neuron" anatomy page and search expression patterns table for the term "marker"; not optimal
- Can also (reverse) sort the "Expression Pattern" column of table to pull up "Marker" annotations
- Create a "Markers" widget? "Tissue Marker" or "Expression Marker"? Daniela will create ticket
- Existing markers in Associations widget will remain there
- Are markers still being curated?
November 9, 2017
Citace upload
- Upload files to Spica for Wen by 6pm next Friday (17th)
New models file for WS263
- Changes snuck up on some CIT curators
- Anatomy function model: Proposal to make Remark entry unique; Raymond asking to remove the UNIQUE in the model
- Still want to have multiple remarks (each on a separate, new line)
- May request a rollback of Anatomy function model change
- Are curators tracking model changes on GitHub?
Data migration call
- At noon PST today, if people want to join and ask questions
Caltech library
- Wen asked about getting articles that we can't get from Caltech library
- Was told each person can get 10 articles per year through inter-library loan
- We need about 500 articles per year that aren't in Caltech library
- Can get articles within an hour
- Maybe we can talk to library to make an agreement to get more articles
- If we get preprints, will it be additional cost to get official final version?
Micropublications
- Currently, curation is performed manually
AGR disease working group update
- Next focus, for AGR 1.3 release in March 2018, is to pull allele objects with basic info into AGR
- Initially, we will pull in only alleles that have disease data and are associated with a single gene
- Basic allele information will be provided on respective AGR gene pages in an Alleles table, along with associated disease data (?)
- Alleles may also be referenced within a gene page disease table
- Disease pages will show Association tables with alleles in addition to genes
- In both contexts, alleles will link to the MOD allele page (until AGR develops an allele page)
- Alleles will be added to the disease association file (DAF) and respective JSON files
- Dedicated AGR allele pages will come at a later release, probably as a work product of the AGR Variants working group
- We will want to work closely with the AGR Variants working group
- Alleles will be a stepping stone to other genotype components and then complex genotypes, as well as treatments/conditions/chemicals etc.
- Also discussed the possibility of a Disease ribbon, showing diseases in AGR orthologs
- If we show disease associations inferred by orthology, we need to be explicit about evidence codes and data provenance; maybe different tables for experimental versus inferred associations?
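The initial selection rule discussed above (alleles that have disease data and are associated with exactly one gene) can be sketched as a simple filter. This is a minimal illustration only: the `Allele` record, its field names, and all identifiers are hypothetical placeholders and do not reflect the actual AGR or WormBase schema.

```python
from dataclasses import dataclass, field


@dataclass
class Allele:
    # Hypothetical record layout, for illustration only.
    allele_id: str
    genes: list = field(default_factory=list)
    diseases: list = field(default_factory=list)


def select_initial_alleles(alleles):
    """Keep alleles that have disease data and exactly one associated gene."""
    return [a for a in alleles if a.diseases and len(set(a.genes)) == 1]


# Placeholder data: only the first allele satisfies both conditions.
alleles = [
    Allele("allele-1", genes=["gene-A"], diseases=["disease-X"]),
    Allele("allele-2", genes=["gene-A", "gene-B"], diseases=["disease-X"]),
    Allele("allele-3", genes=["gene-C"], diseases=[]),
]
selected = select_initial_alleles(alleles)
selected_ids = [a.allele_id for a in selected]
print(selected_ids)
```

Alleles excluded by this filter (multi-gene or without disease data) would presumably be handled in later releases, once other genotype components are modeled.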
AGR orthology
- AGR orthology based exclusively on sequence similarity
- Some "orthologs" are actually just homologs
- Jae sent email to info@alliancegenome.org last week (Wed Nov 1) but hasn't heard back from anyone
- How do we explicitly define "ortholog"? Sequence similarity? Synteny? Functional complementation?
- How do we accommodate both manual and automated assertions of orthology?
Help Desk question: finding all alcohol dehydrogenase genes
- There is no single root "alcohol dehydrogenase" term in the GO MF branch, but many more specific terms that exist in different branches
- This means that one has to search for terms that explicitly have "alcohol dehydrogenase" in the name of the term and cannot take advantage of the Ontology Browser to see all gene associations, direct and inferred
- This approach also will miss terms like "methanol dehydrogenase" or other logical descendant terms
- Also, can search for protein domains/motifs that explicitly have "alcohol dehydrogenase" in the name of the motif, but domains are not (as far as I'm aware, in InterPro or Pfam at least) organized into an ontology
- Note: GO tries to classify MFs according to the Enzyme Commission (EC) classification system. In this classification, the alcohol dehydrogenase (NAD) activity is a sibling of methanol dehydrogenase activity.
- There is a comment associated with the alcohol dehydrogenase entry, E.C. 1.1.1.1, that says:
Comments: A zinc protein. Acts on primary or secondary alcohols or hemi-acetals with very broad specificity; however the enzyme oxidizes methanol much more poorly than ethanol. The animal, but not the yeast, enzyme acts also on cyclic secondary alcohols.
- This may explain why methanol dehydrogenase is a sibling and not a child of alcohol dehydrogenase in this classification.
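The difference between a name-based search and an ontology-based search can be sketched with a toy graph. This is an assumption-laden illustration, not the real GO structure: the two term names come from the notes above, while "shared parent (placeholder)" is a stand-in for whatever term the two siblings share.

```python
# Toy is_a graph: child term -> parent term.
# "shared parent (placeholder)" is a stand-in, not a real GO term name.
PARENT = {
    "alcohol dehydrogenase (NAD) activity": "shared parent (placeholder)",
    "methanol dehydrogenase activity": "shared parent (placeholder)",
}


def name_search(terms, phrase):
    """Return terms whose name contains the phrase (plain string match)."""
    return sorted(t for t in terms if phrase in t)


def descendants(root, parent_map):
    """Return all terms whose is_a chain reaches root."""
    found = set()
    for term in parent_map:
        node = term
        while node in parent_map:
            node = parent_map[node]
            if node == root:
                found.add(term)
                break
    return sorted(found)


all_terms = set(PARENT) | set(PARENT.values())
by_name = name_search(all_terms, "alcohol dehydrogenase")
by_ontology = descendants("shared parent (placeholder)", PARENT)
print(by_name)
print(by_ontology)
```

In this toy graph the name search finds only the term containing the phrase, while a traversal from the shared parent also returns the methanol dehydrogenase sibling. A traversal only helps when a suitable root term exists, which is the gap noted above.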
Alliance/AGR interactions working group
- Now formed, first meeting tomorrow (Friday Nov 10th) at 1pm PST/4pm EST
- Can find folder and related documents in the Alliance "Working Groups" Google folder
AGR gene expression data working group
- Starting up now; awaiting first meeting
- Talk to Wen and/or Daniela if you want to join
Site visits
- Daniela and Karen might be able to join Bay Area worm meeting
November 16, 2017
Inter-library loan
- Getting papers (otherwise unavailable) through inter-library loan
- Have to fill out request forms for (each?) paper that we request
- No charge unless it is a rush request
Social media
- Wen attended social media training at Caltech
- How to use social media to promote their work
- Sean Carroll has > 1 million Twitter followers
- Facebook Caltech posts; target students and faculty, as well as donors
- Social media strategies: videos useful, schedule posts
- Interviews
- Videos: viewers lose interest after ~1 minute
- Would be good to make blog post announcements on Facebook
- Use Twitter, Facebook, WB blog to announce next WB tutorial
- Can post several times per month, rather than once per ~3 months
- Wen can work with Ranjana and Todd, to make blog posts available on FB and Twitter
- Helpful to have fun posts; photos, videos, interviews
- Wen connected with Andy Golden about Baltimore worm meeting on FB
- WB needs to follow other members of the community, PIs, etc.
- Can make posts about interesting papers
- Can post top community curator for the month
- Would be good to have a social media point person, Wen and Ranjana? Involve Todd
- One post per week?
- Good to have specific posts about papers and researchers
- Reward labs with most community curation, highlight their research
Juancarlos' vacation
- May want contingency plan while he is away (Dec 19 - Feb 15)
- Next upload Jan 19
- Will need to consider required changes to models/dumpers before he leaves
November 30, 2017
ISB Meeting, April 2018
- https://www.biocuration.org/biocuration-2018-call-for-abstracts/
- Meeting in Shanghai (could meet with worm labs there)
- This year abstracts are invited for the following topic areas:
- Precision Medicine
- Phenotypes, genotypes, and variants
- Data Standards and Ontologies
- Text Mining
- Functional Annotation
- Community Annotation
- Data Integration and Visualization
- Deep Learning in curation process
- Softwares, Applications and Systems in biocuration
- Curation Standards and Best Practice; inference from evidence; data and annotation quality
- What does WormBase need to take away from the meeting? What does WB have to present? Important projects
- Micropublication group/member will go
- Could send people on behalf of AGR or AGR projects/working group
- Could present about interactions
Expression Cluster -> WOBr
- It looks like we have a data annotation issue: the data are about {AFD and/or AWB}, but the WormBase annotation is read as {AFD} and {AWB} separately.
- The data source publication is <http://www.wormbase.org/tools/tree/run?name=WBPaper00024671;class=Paper;expand=Refers_to#Refers_to>, and the dataset is Expression cluster » WBPaper00024671:AFD_AWB_vs_unsorted_upregulated <http://www.wormbase.org/species/c_elegans/expression_cluster/WBPaper00024671:AFD_AWB_vs_unsorted_upregulated#013--10>.
- Given that {AFD and/or AWB} is not a natural anatomy group (that is, there is no specific functional meaning nor is it a frequently used grouping), I don't think the solution is to invent an anatomy ontology term just for this dataset.
- Instead, I propose that we remove the {AFD and/or AWB} datasets from associating with AFD and AWB respectively, so that the associations don't get interpreted incorrectly. This 'rule' should be applied to all ambiguous datasets.
- Need to carefully consider each expression cluster for downstream incorporation into WOBr and/or TEA; e.g. don't want to use downregulated set of genes to inform a presence or absence call
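The proposed rule (do not associate an ambiguous "and/or" dataset with each of its anatomy terms individually) can be sketched as follows. This is only a sketch under stated assumptions: the `ExpressionCluster` record and its `ambiguous` flag are hypothetical, and the cluster named `example:AFD_only` is invented for contrast. Only the AFD/AWB cluster name comes from the notes above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionCluster:
    # Hypothetical record layout, for illustration only.
    name: str
    anatomy_terms: tuple
    ambiguous: bool  # True when the data mean "term A and/or term B"


def anatomy_associations(clusters):
    """Map anatomy term -> cluster names, skipping ambiguous multi-term clusters."""
    result = {}
    for cluster in clusters:
        if cluster.ambiguous and len(cluster.anatomy_terms) > 1:
            continue  # would otherwise be read as a claim about each term alone
        for term in cluster.anatomy_terms:
            result.setdefault(term, []).append(cluster.name)
    return result


clusters = [
    ExpressionCluster(
        "WBPaper00024671:AFD_AWB_vs_unsorted_upregulated", ("AFD", "AWB"), True
    ),
    ExpressionCluster("example:AFD_only", ("AFD",), False),  # invented example
]
associations = anatomy_associations(clusters)
print(associations)
```

A similar per-cluster flag could record whether a gene set is upregulated or downregulated, so that downregulated sets are kept out of presence or absence calls in WOBr and TEA.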
AGR Gene Descriptions working group
- Met yesterday for first time
- Reviewed how each group does gene descriptions
- Yeast mostly (entirely?) manual; SGD gold standard descriptions
- Many are automated (worm, rat, fly)
- Group will meet twice a month, for now
- Want to standardize the structure/construction of descriptions
Contingency plan for Juancarlos' vacation
- Juancarlos off Dec 19th - Feb 15
- Hold off on model changes or dumper changes until after Juancarlos gets back
- Next upload is Jan 19, 2018
- Make sure Hinxton knows not to change any CIT-relevant models
RRIDs
- Karen: has anyone received any comments or questions about RRIDs from the community?
- Karen working on document for AGR to express MODs' concerns about RRIDs
- Authors should know that RRIDs are not required and may cause more problems than they solve
- We want to consider where RRIDs could be helpful (antibodies?), but push back where it is more problematic
- Raymond: Cell Press is taking in tables of reagents with URLs; we should make sure that C. elegans reagents conform to WB/AGR standards
December 7th, 2017
Author First Pass Form
- Analysis of current flags and numbers of entries
- Overall approach is to move from a flagging form to a validation and data entry form
Questions
- Question for Wen: Does it make sense to include an afp flag for expression cluster?
- Expression, other expression and marker seem to have the same meaning. Cannot find the fields corresponding to marker and expression; are these still active? If so, what do they map to?
- Transgene: which part of the form populates the transgene table?
- Question for Karen: confirm that the afp_transgene table is coming from the genetics/G3 pipeline and not the AFP form.
- Question for Karen: extvariation is coming from the genetics/G3 pipeline?
- Question for Karen: newstrains is coming from the genetics/G3 pipeline?
- Transgenes: will we be mining for transgenes and asking authors to confirm/add?
- Question for WB curators: Are we missing data types?