Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
 
(806 intermediate revisions by 11 users not shown)
Line 19: Line 19:
 
[[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]]
 
[[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]]
  
 +
[[WormBase-Caltech_Weekly_Calls_2019|2019 Meetings]]
  
GoToMeeting link: https://www.gotomeet.me/wormbase1
+
[[WormBase-Caltech_Weekly_Calls_2020|2020 Meetings]]
  
 +
[[WormBase-Caltech_Weekly_Calls_2021|2021 Meetings]]
  
= 2019 Meetings =
+
= 2022 Meetings =
  
[[WormBase-Caltech_Weekly_Calls_January_2019|January]]
+
[[WormBase-Caltech_Weekly_Calls_January_2018|January]]
  
[[WormBase-Caltech_Weekly_Calls_February_2019|February]]
+
= January 13th, 2022 =
 
+
== tm variation - gene associations ==
[[WormBase-Caltech_Weekly_Calls_March_2019|March]]
+
*Update on progress and some questions for the Caltech curators
 
+
*Background: not all variations were being associated with genes in the OA table because some of those associations are in WS but not in geneace, so they weren't coming through in the nightly geneace dump. Some variation-gene associations are made as part of the VEP pipeline during the build.
[[WormBase-Caltech_Weekly_Calls_April_2019|April]]
+
**https://github.com/WormBase/website/issues/8262
 
+
**https://wiki.wormbase.org/index.php/WBGene_information_and_status_pipeline
[[WormBase-Caltech_Weekly_Calls_May_2019|May]]
+
**https://wiki.wormbase.org/index.php/Source_and_maintenance_of_non-WBGene_info
 
+
**https://wiki.wormbase.org/index.php/Updating_Postgres_with_New_WS_Information
[[WormBase-Caltech_Weekly_Calls_June_2019|June]]
+
*Wen now downloads several full ACeDB classes from the latest WS release in the form of .ace files so we can also have whatever information is in WS.  Raymond wrote a script to sync those files to tazendra for further processing/use.
 
+
*A few questions that we want to confirm before going forward:
[[WormBase-Caltech_Weekly_Calls_July_2019|July]]
+
**In the WS variations file, there are 2,130,801 total variations (1,911,339 total Live) while in postgres there are currently 106,080.
 
+
***Only include Status = Live variations?
[[WormBase-Caltech_Weekly_Calls_August_2019|August]]
+
***Include regardless of whether there is an associated gene (this seems to be the current practice?).
 
+
***Currently, some variations with a given Method, e.g. Million_mutation, are NOT included. We would continue this filtering.
[[WormBase-Caltech_Weekly_Calls_September_2019|September]]
+
****SNP
 
+
****WGS_Hawaiian_Waterston
[[WormBase-Caltech_Weekly_Calls_October_2019|October]]
+
****WGS_Pasadena_Quinlan
 
+
****WGS_Hobert
 
+
****Million_mutation
== November 7, 2019 ==
+
****WGS_Yanai
 
+
****WGS_De_Bono
=== WS275 Citace upload ===
+
****WGS_Andersen
* Maybe Nov 22 upload to Hinxton
+
****WGS_Flibotte
* CIT curators upload to Spica on Tues Nov 19
+
****WGS_Rose
 
+
***Do we want other filters?
=== ?Genotype class ===
+
**For genes, the ace file contains ALL the gene objects in WB regardless of species.
* [https://docs.google.com/document/d/19hP9r6BpPW3FSAeC_67FNyNq58NGp4eaXBT42Ch3gDE/edit?usp=sharing Working data model document]
+
***We've recently had an author request, via the Acknowledge pipeline, to associate genes of other, less well studied Caenorhabditis species, e.g. C. inopinata, to [https://academic.oup.com/g3journal/article/11/3/jkab022/6121926 their paper].
* Several classes have a "Genotype" tag with text entry
+
***Do we want all Caenorhabditis (and other nematode) species genes in our various gene tables, e.g. obo, paper? Any other species?
** Strain
+
***The effect on the autocomplete, if we include all, probably won't be a problem (1,018,332 vs 306,116).
** 2_point_data
+
***Some of the gene IDs from other species don't have 'WBGene' prefixes, e.g. Sp34_10109610. Should we keep these in a separate table from genes with 'WBGene' prefixes?
** Pos_neg_data
 
** Multi_pt_data
 
** RNAi
 
** Phenotype_info
 
** Mass_spec_experiment (no data as of WS273)
 
** Condition
 
* Collecting all genotype text entries yields ~33,000 unique entries, with many different forms:
 
** Species entries, like "Acrobeloides butschlii wild isolate" or "C. briggsae"
 
** Strain entries, like "BA17[fem-1(hc-17)]" or "BB21" or "BL1[pK08F4.7::K08F4.7::GFP; rol-6(+)]"
 
** Anonymous transgenes, like "BEC-1::GFP" or "CAM-1-GFP" or "Ex[Pnpr-9::unc-103(gf)]"
 
** Complex constructs, like "C56C10.9(gk5253[loxP + Pmyo-2::GFP::unc-54 3' UTR + Prps-27::neoR::unc-54 3' UTR + loxP]) II"
 
** Text descriptions, like "Control" or "WT" or "Control worms fed on HT115 containing the L4440 vector without insert" or "N.A."
 
** Bacterial genotypes, like "E. coli [argA, lysA, mcrA, mcrB, IN(rrnD-rrnE)1, lambda-, rcn14::Tn10(DE3 lysogen::lavUV5 promoter -T7 polymerase]"
 
** Including balancers, like "F26H9.8(ok2510) I/hT2 [bli-4(e937) let-?(q782) qIs48] (I;III)"
 
** Reference to parent strain, like "Parent strain is AG359"
 
** Referring to RNAi, like "Pglr-1::wrm-1(RNAi)" or "Phsp-6::gfp; phb-1(RNAi)"
 
** Referring to apparent null or loss of function alleles, like "Phsp-4::GFP(zcIs4); daf-2(-)" or "ced-10(lf)"
 
 
 
=== Gene comparison SObA ===
 
* http://wobr2.caltech.edu/~azurebrd/cgi-bin/soba_multi.cgi?action=Gene+Pair+to+SObA+Graph
 
 
 
 
 
== November 14, 2019 ==
 
 
 
=== TAGC meeting ===
 
* The Allied Genetics Conference next April (2020) in/near Washington DC
 
* Abstract deadline is Dec 5th
 
* Alliance has a shared booth (3 adjacent booths)
 
* Micropublications will have a booth (Karen and Daniela will attend)
 
* Focus will be on highlighting the Alliance
 
* Workshop at NLM in days following TAGC about curation at scale (Kimberly attending and chairing session)
 
 
 
=== Alliance all hands meeting ===
 
* Lightning talk topics?
 
** Single-cell RNA-seq (Eduardo)
 
** SimpleMine? (Wen)
 
** SObA? (Raymond); still working on multi-species SObA
 
** Phenotype community curation?
 
** Micropublications?
 
** AFP?
 
 
 
=== Alliance general ===
 
* Alliance needs a curation database
 
** A curation working group was proposed
 
** What needs to happen to get this going?
 
** Would include text mining tools/resources
 
** Would be good to have something like the curation status form
 
** MODs likely have their own special requirements, but there should probably be at least a common minimal set of features
 
** Variant sequence curation could be a good first start (if MODs handle their own variant sequence curation) as a common data type
 
* Micropubs pushing data submission forms; might as well house them within the Alliance
 
* Would be good to have a common (or individually relevant) AFP form(s) for all Alliance members
 

Latest revision as of 18:56, 13 January 2022

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings

2020 Meetings

2021 Meetings

2022 Meetings

January

January 13th, 2022

tm variation - gene associations

  • Update on progress and some questions for the Caltech curators
  • Background: not all variations were being associated with genes in the OA table because some of those associations are in WS but not in geneace, so they weren't coming through in the nightly geneace dump. Some variation-gene associations are made as part of the VEP pipeline during the build.
  • Wen now downloads several full ACeDB classes from the latest WS release in the form of .ace files so we can also have whatever information is in WS. Raymond wrote a script to sync those files to tazendra for further processing/use.
  • A few questions that we want to confirm before going forward:
    • In the WS variations file, there are 2,130,801 total variations (1,911,339 total Live) while in postgres there are currently 106,080.
      • Only include Status = Live variations?
      • Include regardless of whether there is an associated gene (this seems to be the current practice?).
      • Currently, some variations with a given Method, e.g. Million_mutation, are NOT included. We would continue this filtering.
        • SNP
        • WGS_Hawaiian_Waterston
        • WGS_Pasadena_Quinlan
        • WGS_Hobert
        • Million_mutation
        • WGS_Yanai
        • WGS_De_Bono
        • WGS_Andersen
        • WGS_Flibotte
        • WGS_Rose
      • Do we want other filters?
    • For genes, the ace file contains ALL the gene objects in WB regardless of species.
      • We've recently had an author request, via the Acknowledge pipeline, to associate genes of other, less well studied Caenorhabditis species, e.g. C. inopinata, to their paper.
      • Do we want all Caenorhabditis (and other nematode) species genes in our various gene tables, e.g. obo, paper? Any other species?
      • The effect on the autocomplete, if we include all, probably won't be a problem (1,018,332 vs 306,116).
      • Some of the gene IDs from other species don't have 'WBGene' prefixes, e.g. Sp34_10109610. Should we keep these in a separate table from genes with 'WBGene' prefixes?
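
The filters under discussion could be sketched roughly as follows. This is only an illustration, not the actual tazendra/postgres loading code: it assumes the .ace data has already been parsed into plain Python dicts, and the keys "status" and "method" and the function names are hypothetical. The excluded Method list is the one given in the notes above. A gene association is deliberately not required, matching what seems to be current practice.

```python
# Methods listed in the notes as currently NOT loaded into postgres.
EXCLUDED_METHODS = frozenset({
    "SNP",
    "WGS_Hawaiian_Waterston",
    "WGS_Pasadena_Quinlan",
    "WGS_Hobert",
    "Million_mutation",
    "WGS_Yanai",
    "WGS_De_Bono",
    "WGS_Andersen",
    "WGS_Flibotte",
    "WGS_Rose",
})


def keep_variation(record):
    """Return True if a variation passes the proposed filters.

    Assumption: `record` is a dict with hypothetical "status" and "method"
    keys. Whether the variation has an associated gene is not checked.
    """
    is_live = record.get("status") == "Live"
    is_excluded = record.get("method") in EXCLUDED_METHODS
    return is_live and not is_excluded


def filter_variations(records):
    """Keep only Live variations whose Method is not in the excluded list."""
    return [r for r in records if keep_variation(r)]


def split_gene_ids(gene_ids):
    """Partition gene IDs by whether they carry the 'WBGene' prefix.

    Returns (prefixed, other), so that IDs such as Sp34_10109610 could be
    routed to a separate table if that is what we decide.
    """
    prefixed, other = [], []
    for gene_id in gene_ids:
        if gene_id.startswith("WBGene"):
            prefixed.append(gene_id)
        else:
            other.append(gene_id)
    return prefixed, other
```

If we decide on other filters, they would only need to be added to `keep_variation`; the split in `split_gene_ids` is independent of whichever table layout is chosen for non-'WBGene' identifiers.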