WormBase-Caltech Weekly Calls
GoToMeeting link: https://www.gotomeet.me/wormbase1
2018 Meetings
June 7, 2018
Alliance literature pipeline working group
- Has anyone gotten back to Carol Bult about WB/Textpresso membership in the group? Not yet
- Paul & Kimberly can join; Kimberly will sit in on group meetings
SPELL migration
- SGD put SPELL into the cloud; will give us code to set up WB SPELL in cloud
- Want to have common interface but two instances (yeast & worm)
- Worm data in SPELL may be more complicated than the yeast data (different species, platforms, meta data, etc.)
- Mike Cherry proposed to use GTEx to replace SPELL. (www.gtexportal.org)
- Trying to get GTEx to have the same functionality as SPELL
Progress report
- Wen can generate numbers of changes since last year
- 5-year progress report coming up (for funding agencies)
- Want a progress report-like document to give to SAB in July
- WS259 - WS265
Paper author name
- Jonathan Ewbank wrote in about an incorrect author name
- Issue for WBPaper00048704: https://wormbase.org/resources/paper/WBPaper00048704#0--10
- First author is "Li, C" in PubMed (the correct name) but appears in WormBase as "Chun, L"
- Will make a GitHub ticket; web team may need to look at
Lab class
- Lab data is not clean/consistent; has gotten messy
- Cecilia trying to clean it up
- There is conflicting data from lab class, author class, and person class
- Currently there is a ~redundant curation pipeline; considering pulling person info into lab class
- When looking at lab page, if there is a problem with a person's info, changes would be made (requested) in the person class, not the lab class
- Cecilia will contact labs to ask about correctness of information
- Can have a discussion with Ann and Aric at CGC
How many C. elegans genes have "good" knockouts?
- Paul giving talk next week, wants to report
- Chris will look into
- Mitani did 1,000 deletions in last year
Server logs
- SimpleMine logs are all coming from Amazon; will need to ask Todd about SimpleMine usage (from AWS stats?)
Migration of lab data
- Data for labs was coming from Name Server (in Hinxton) - Comment:[PD] Not coming from the Nameserver, geneace was the only source of Lab data prior to handover.
- Now Paul D has stopped curating lab info - Comment:[PD] Build config has been updated to take the majority of lab data from citace rather than geneace
Database File Class remove/format
----------------------------------------------------------------
db=geneace file=geneace_Laboratory.ace class=Laboratory format="Alleles WBVar\d{8}" delete=Representative delete=Registered_lab_members delete=Past_lab_members delete=Allele_designation delete=Strain_designation delete=Address
db=citace file=caltech_Laboratory.ace class=Laboratory delete=Alleles format="Representative WBPerson\d{1,5}" format="Registered_lab_members WBPerson\d{1,5}" format="Past_lab_members WBPerson\d{1,5}"
- Now that Cecilia is curating, will pull lab info from Postgres
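To make the build configuration rows above easier to read, here is a minimal Python sketch of how such a row could be interpreted. It is not the actual WormBase build code. The semantics are assumptions: `delete=Tag` is taken to drop every line starting with that tag, and `format="Tag regex"` is taken to keep a line with that tag only when its value fully matches the regex. The function names and the sample .ace lines are hypothetical.

```python
import re

# Matches key=value or key="value with spaces" tokens in one config row.
TOKEN = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_rule(config_line):
    """Split a config row into db/file/class plus lists of delete and format rules."""
    rule = {"delete": [], "format": []}
    for key, quoted, bare in TOKEN.findall(config_line):
        value = quoted if quoted else bare
        if key in ("delete", "format"):
            rule[key].append(value)
        else:
            rule[key] = value
    return rule


def filter_lines(rule, ace_lines):
    """Apply the ASSUMED semantics: drop deleted tags, check formatted tags."""
    formats = {}
    for spec in rule["format"]:
        tag, pattern = spec.split(" ", 1)
        formats[tag] = re.compile(pattern)
    kept = []
    for line in ace_lines:
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        tag = parts[0]
        value = parts[1].strip().strip('"') if len(parts) > 1 else ""
        if tag in rule["delete"]:
            continue  # tag is supplied by the other database
        if tag in formats and not formats[tag].fullmatch(value):
            continue  # value does not look like the expected ID
        kept.append(line)
    return kept


# Shortened copy of the citace row from the notes.
CITACE_ROW = (r'db=citace file=caltech_Laboratory.ace class=Laboratory '
              r'delete=Alleles format="Representative WBPerson\d{1,5}"')

# Hypothetical .ace-style lines, for illustration only.
SAMPLE_LINES = [
    'Representative "WBPerson77"',
    'Representative "Jane Doe"',
    'Alleles "WBVar00000001"',
    'Mail "Example Institute"',
]

if __name__ == "__main__":
    for line in filter_lines(parse_rule(CITACE_ROW), SAMPLE_LINES):
        print(line)
```

Under these assumptions the two rows split responsibility: allele lines come only from geneace, while person-related lines (representative, current and past lab members) come only from citace.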
June 21, 2018
SAB
- Lightning talks at project meeting; everyone should consider what they want to present (5 minutes with 5 minutes for questions)
- 45-minute curation talk to SAB; Chris volunteered; anyone else interested?
- What data types are being incorporated with the Alliance? What have we gained from those conversations?
- How have Alliance interactions benefited us?
- How does the Datomic migration affect our curation and data models?
- Alliance data models should be union of existing MOD data models, but does not require curation of all attributes at each MOD
- Do WB and our users benefit from creating a ?Genotype class?
Phenotype & Disease Face-to-Face
- Reviewed phenotype curation practices at each MOD
- Discussed strain and genotype classes
- Generally it's felt that genotypes should only represent full genotypes of actual individuals or strains
- For phenotype annotations WB will still need a mechanism to attribute the specific genotypic components that are responsible for the observed phenotype along with a way to capture the complete or background genotype for context
- MGI considers strains to be part of a genotype (i.e. background inbred strain into which alleles are introduced)
- WB and SGD consider genotype to be part of strain
- WB will still consider moving ahead with instantiating a ?Genotype class for capturing genotypes (e.g. for disease models) that don't have an explicitly reported strain name; also for capturing transient genotypes like heterozygosity as well as paternal and maternal contributions/genotypes
June 28, 2018
Bulk emailing community for phenotypes
- Chris & Juancarlos have now set up a pipeline to send requests in bulk
- Can send 75 emails at a time; GMail may limit to 500 per day (was hoping for 3,000 at a time)
- Sent out 338 emails in past week; already received annotations for 18 papers, ~50 annotations
- Papers go back to 1987; the earliest paper for which curation was received in the last week is from 2008
- Have sent about 25 papers from 2018, got annotations for 5 of them
- Ask SAB about email frequency
- Would be good to link to a Textpresso search
- Maybe make a link people can click on to indicate there is no phenotype data
- Can a submitter chat with a curator? Maybe integrate Olark chat? Would be good to make available, indicate how to chat
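The sending limits noted above (75 emails per batch, a possible cap of 500 per day) can be sketched as a simple scheduling calculation. This is only an illustration in Python, not the pipeline Chris and Juancarlos built. The function name `plan_batches` and the placeholder recipient lists are hypothetical, and the 500-per-day figure is the tentative limit from the notes rather than a confirmed value.

```python
def plan_batches(recipients, batch_size=75, daily_limit=500):
    """Group recipients into days, and each day into batches.

    Returns a list of days; each day is a list of batches; each batch
    is a list of recipients. No email is sent here.
    """
    days = []
    for day_start in range(0, len(recipients), daily_limit):
        todays = recipients[day_start:day_start + daily_limit]
        batches = [
            todays[i:i + batch_size]
            for i in range(0, len(todays), batch_size)
        ]
        days.append(batches)
    return days


if __name__ == "__main__":
    # Placeholder recipients; 338 matches the number sent in the past week.
    recipients = [f"author{i}" for i in range(338)]
    for day_number, batches in enumerate(plan_batches(recipients), start=1):
        sizes = [len(batch) for batch in batches]
        print(f"day {day_number}: {len(batches)} batches, sizes {sizes}")
```

With these numbers, 338 requests fit in one day as five batches, while the hoped-for 3,000 would be spread over six days.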
SAB
- Chris & Paul will present (curation & intro respectively)
- Tools? Raymond can bring up enrichment analysis
- Project meeting
- Everyone should sign up for a topic if they haven't already
Phenotype Ontology
- Is the Phenotype ontology growing? Very slowly
- Received several suggested phenotypes from community in last batch
- Want to focus on logical definitions
- Mammalian Phenotype ontology - precomposed terms with logical definitions
- Raymond trying to use the SObA approach to display logically defined phenotype terms; needs to figure out how to effectively model it
- Should get useful feedback at the ICBO meeting in August
- Want feedback on:
- Pre-composed vs. post-composed
- Granularity
- Logical definitions