WBConfCall 2025.03.06-Agenda and Minutes
From WormBaseWiki
Help desk issues
Allele Sequence Change Curation
- Wen screened Textpresso for 2267 alleles (~20% of all alleles without sequence information in WS296) and identified ~300 alleles whose names are mentioned together with keywords including "flanking", "transposon", "frameshift", "missense", "nonsense", "point mutation" ...
- Most of the sentences contain valid information to annotate allele sequence changes.
- xc3 -- WBPaper00053572 -- "Next Generation Sequencing allowed us to identify 30 bp flanking sequences of the alleles xc3, xc4, and xc5 as TTGTCCAAGTCTACGTCAATCGGGCAATGT [42 bp deletion] AGCCCATAATTCCCCCGTATTCGTATCCCA, TCTACGTCAATCGGGCAATGTCGTCCAGTT - [3 bp deletion, 41 bp insertion (GGTCTGAATGACTTTCGCACTATTCCCCTATTCGCACGCCT)] ATTCGCACGTATGATTCGTCGTTGCAATGT, and AACCTTGTCCAAGTCTACGTCAATCGGGCA - [111 bp deletion] - TCATCCCTTCACTTTGTAATATAATTTTAT, respectively"
- fd139 -- WBPaper00053971 -- "fd139 is a 5-bp deletion in sid-3 exon 13, leading to a frameshift following Ser 989 of the SID-3 protein (B0302.1a)"
- za16 -- WBPaper00035549 -- "The R08E3.4(za16) allele has a Mos1 transposon insertion near the end of the 9th exon of R08E3.4"
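A keyword screen of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Textpresso query Wen actually ran: the function name `screen_sentences` is hypothetical, the input is assumed to be a plain list of sentences, and the keyword list is the partial one quoted in these minutes.

```python
import re

# Keywords quoted in the minutes; the original list is open-ended ("...").
KEYWORDS = ["flanking", "transposon", "frameshift", "missense", "nonsense", "point mutation"]


def screen_sentences(sentences, alleles, keywords=KEYWORDS):
    """Return (allele, matched keywords, sentence) for every sentence that
    mentions an allele name together with at least one keyword.

    Hypothetical helper for illustration only.
    """
    hits = []
    for sentence in sentences:
        lowered = sentence.lower()
        found = [k for k in keywords if k in lowered]
        if not found:
            continue
        for allele in alleles:
            # Require non-alphanumeric neighbours so "xc3" does not match "xc30".
            pattern = r"(?<![A-Za-z0-9])" + re.escape(allele) + r"(?![A-Za-z0-9])"
            if re.search(pattern, sentence):
                hits.append((allele, found, sentence))
    return hits
```

A substring match like this will produce false positives (for example, a keyword that refers to a different allele in the same sentence), so each hit would still need curator review.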
- How should we proceed from here?
- There are some false positives
- Not all of them contain enough information to place the alleles in JBrowse, but we may still get information such as allele types and gene associations into WormBase.
- This still requires a lot of curator time to get the flanking sequences, etc.
- There could be 1000-2000 published sequenced alleles pending curation. Shall we make use of community curation?
- Option 1: Send the allele with the published sentences to the labs that generated them.
- Any other thoughts?
- Need to verify the flanks manually. This is the most time-consuming part. Can we automate this step? Not easy to automate.
- Can community curators verify the flanks and upload a screenshot of JBrowse?
- Can promote this at IWM.
- Can keep information in the Remark. Some information can be extracted.
- Should push these data to those who generated them. For ancient papers, emailing the PIs was not useful because they did not do the experiments themselves.
- Connecting alleles with genes and papers already has a lot of value to users.
- Hinxton received about 20 allele sequence submissions per release cycle, and sometimes several hundred.
- Community generates ~300 alleles per month.
- All three examples above already have information in the Remark.
- Stavros can do a demo in April.
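On the question of automating flank verification, a first-pass check could look like the sketch below. It assumes the reference sequence for the region is already available as a string and that both flanks are reported on the forward strand. The function names `find_all` and `check_flanks` are hypothetical and are not part of any WormBase pipeline. For a deletion (or deletion plus insertion), the two flanks should be separated in the reference by exactly the number of deleted bases; for a pure insertion the expected gap is 0.

```python
def find_all(haystack, needle):
    """Return every 0-based start position of needle in haystack (overlaps allowed)."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


def check_flanks(reference, left_flank, right_flank, deleted_bp):
    """Return (start, end) intervals, 0-based and end-exclusive, where the
    flanks sit in the reference exactly deleted_bp bases apart.

    An empty list means the flanks could not be placed; more than one entry
    means the placement is ambiguous and needs manual review.
    Forward strand only; reverse-complement flanks are not handled here.
    """
    reference = reference.upper()
    left_flank = left_flank.upper()
    right_flank = right_flank.upper()
    placements = []
    for left_start in find_all(reference, left_flank):
        left_end = left_start + len(left_flank)
        for right_start in find_all(reference, right_flank):
            if right_start - left_end == deleted_bp:
                placements.append((left_end, right_start))
    return placements
```

Exact string matching is deliberately strict: a single mismatch in a published flank yields no placement, and those cases would fall back to manual verification.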