WBConfCall 2025.03.06-Agenda and Minutes

From WormBaseWiki
Revision as of 17:25, 6 March 2025 by Wchen (→‎Allele Sequence Change Curation)

Help desk issues


Allele Sequence Change Curation

  • Wen screened Textpresso for 2267 alleles (~20% of all alleles without sequence information in WS296) and identified ~300 alleles whose names are mentioned together with keywords including "flanking", "transposon", "frameshift", "missense", "nonsense", "point mutation" ...
  • Most of the matching sentences contain valid information for annotating allele sequence changes. Examples:
    • xc3 -- WBPaper00053572 -- "Next Generation Sequencing allowed us to identify 30 bp flanking sequences of the alleles xc3, xc4, and xc5 as TTGTCCAAGTCTACGTCAATCGGGCAATGT [42 bp deletion] AGCCCATAATTCCCCCGTATTCGTATCCCA, TCTACGTCAATCGGGCAATGTCGTCCAGTT - [3 bp deletion, 41 bp insertion (GGTCTGAATGACTTTCGCACTATTCCCCTATTCGCACGCCT)] ATTCGCACGTATGATTCGTCGTTGCAATGT, and AACCTTGTCCAAGTCTACGTCAATCGGGCA - [111 bp deletion] - TCATCCCTTCACTTTGTAATATAATTTTAT, respectively"
    • fd139 -- WBPaper00053971 -- "fd139 is a 5-bp deletion in sid-3 exon 13, leading to a frameshift following Ser 989 of the SID-3 protein (B0302.1a)"
    • za16 -- WBPaper00035549 -- "The R08E3.4(za16) allele has a Mos1 transposon insertion near the end of the 9th exon of R08E3.4"
  • How should we proceed from here?
    • There are some false positives
    • Not all of the sentences contain enough information to place the alleles on JBrowse, but we may still get information such as allele types and gene associations into WormBase.
    • Still requires lots of curator time to get the flanking sequences etc.
    • There could be 1000 - 2000 published sequenced alleles pending curation. Shall we make use of community curation?
      • Option 1: Send the allele with the published sentences to the labs that generated them.
      • Any other thoughts?
  • Need to verify the flanks manually. This is the most time-consuming part. Can we automate this step? Not easy to automate.
    • Can community curators verify the flanks and upload a screenshot of JBrowse?
    • Can promote this at IWM.
    • Can keep information in the Remark. Some information can be extracted.
    • Should push these data to those who generated them. For old papers, emailing the PIs was not useful because they did not do the experiments themselves.
    • Connecting alleles with genes and paper already has a lot of value to users.
    • Hinxton receives about 20 allele sequence submissions per release cycle, and sometimes several hundred in a single release.
    • Community generates ~300 alleles per month.
    • All three examples above already have information in the Remark.
    • Stavros can do a demo in April.
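
The keyword screen and the flank check discussed above could be prototyped roughly as follows. This is only a sketch under stated assumptions, not the Textpresso query Wen ran or any existing WormBase pipeline: the function names (`keyword_hits`, `find_all`, `check_flanks`) are hypothetical, the keyword list is the partial one quoted in these minutes, and the flank check searches the forward strand only, using exact string matching against a reference sequence the caller supplies.

```python
import re

# Keywords quoted in the minutes; the list there ends with "...", so it is incomplete.
KEYWORDS = ("flanking", "transposon", "frameshift", "missense", "nonsense", "point mutation")


def keyword_hits(sentence, allele):
    """Return the keywords co-mentioned with `allele` in `sentence` (empty list if none)."""
    text = sentence.lower()
    # Match the allele name as a whole token so that "xc3" does not match "xc31".
    pattern = r"(?<![a-z0-9])" + re.escape(allele.lower()) + r"(?![a-z0-9])"
    if not re.search(pattern, text):
        return []
    return [kw for kw in KEYWORDS if kw in text]


def find_all(reference, query):
    """Return every 0-based start position of `query` in `reference`, overlaps included."""
    positions = []
    start = reference.find(query)
    while start != -1:
        positions.append(start)
        start = reference.find(query, start + 1)
    return positions


def check_flanks(reference, left_flank, right_flank):
    """Locate both flanks on the forward strand and report the span between them.

    The status is "ok", "missing" (a flank was not found), "ambiguous" (a flank
    matches more than once), or "inconsistent" (the right flank starts before
    the left flank ends).
    """
    left = find_all(reference, left_flank)
    right = find_all(reference, right_flank)
    if not left or not right:
        return {"status": "missing"}
    if len(left) > 1 or len(right) > 1:
        return {"status": "ambiguous"}
    gap_start = left[0] + len(left_flank)  # first base after the left flank
    gap_end = right[0]                     # first base of the right flank
    if gap_end < gap_start:
        return {"status": "inconsistent"}
    return {"status": "ok", "start": gap_start, "end": gap_end, "length": gap_end - gap_start}


if __name__ == "__main__":
    # Flanks as published for xc3; the reference below is a made-up toy sequence,
    # not real genomic sequence.
    left = "TTGTCCAAGTCTACGTCAATCGGGCAATGT"
    right = "AGCCCATAATTCCCCCGTATTCGTATCCCA"
    toy_reference = "AC" * 5 + left + "N" * 42 + right + "GT" * 5
    print(check_flanks(toy_reference, left, right))
    print(keyword_hits("fd139 is a 5-bp deletion in sid-3 exon 13, leading to a frameshift", "fd139"))
```

On the toy reference the check reports a 42-base span between the flanks, matching the published "42 bp deletion" for xc3. A real check would also need to handle the reverse strand and flanks that differ slightly from the current reference, so anything other than "ok" would still go to a curator.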