WormBase-Caltech Weekly Calls May 2012

May 3, 2012

Curator Timestamps

  • Need to determine what data was provided directly by a curator vs. what was populated automatically (e.g. by mapping scripts)
  • Older data provided by curators who are no longer here will be problematic
  • We should archive all data-processing scripts in GitHub
  • Scripts can be made to create a unique timestamp that identifies that script after the fact
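
One way a script could leave an identifiable timestamp is to reserve a fixed value per script, derived from the script's name. The sketch below is only an illustration: the function name `script_timestamp`, the reserved base date, and the hashing scheme are all assumptions, not an agreed design.

```python
import hashlib
from datetime import datetime, timedelta

# Hypothetical reserved base date; any date that curators never use would do.
BASE = datetime(1970, 1, 1)

def script_timestamp(script_name: str) -> datetime:
    """Map a script name to a fixed, reproducible timestamp (assumed scheme)."""
    digest = hashlib.sha1(script_name.encode("utf-8")).hexdigest()
    # Use the first 6 hex digits as an offset in seconds (under 195 days).
    offset_seconds = int(digest[:6], 16)
    return BASE + timedelta(seconds=offset_seconds)
```

Because the value depends only on the script name, rows written by the same script share one timestamp and can be told apart from curator entries later. Two names could in principle hash to the same offset, so a real scheme would probably keep an explicit registry of script names and timestamps.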


Interaction model change

  • Pulling Variation, Transgene, Antibody, and Expr_pattern tags out of the Interactor_info hash and into the main model
  • The original motivation was to be able to capture intragenic interactions
  • The problem is the inherent disconnect between an interactor entity (e.g. gene) and these objects
  • Making this change would force a post-curation mechanism that ties these entities together for intuitive data display
    • Such a linking mechanism may be error-prone and potentially a headache for the web team
  • Is there a better way to handle this type of data?
  • Chris will discuss with Todd to see how much of a problem this would pose to the web team


May 17, 2012

Life_stage model

  • Include species tag?
  • Create an ontology of life stages for different species
  • There are differences in nomenclature
  • Include Matt Berriman and Mark Blaxter in conversation on this?
  • Could include species abbreviation as prefix: e.g. Ppa_L1, Cbr_L3, etc.
  • How should we provide descriptions for each life stage?
  • Two main proposals:
    • 1. General ontology of names: L1, embryo, dauer, adult, etc.
    • 2. Go from most general to most specific
      • e.g. Larval stage -> L1 -> Ppa_L1
  • Maybe make a C. elegans Slim Ontology
  • Anyone requesting a new life stage should provide a description
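
The species-prefix idea and the general-to-specific proposal can be sketched together. This is a minimal illustration, assuming a three-letter species prefix followed by an underscore; the `PARENT` table and the function names are hypothetical, and only the example terms (Ppa_L1, Cbr_L3, L1, Larval stage) come from the discussion.

```python
# Hypothetical parent links, from most specific to most general.
PARENT = {
    "Ppa_L1": "L1",
    "Cbr_L3": "L3",
    "L1": "Larval stage",
    "L3": "Larval stage",
}

def split_prefix(name: str):
    """Split 'Ppa_L1' into ('Ppa', 'L1'); unprefixed names give (None, name)."""
    prefix, sep, stage = name.partition("_")
    # Assumed prefix shape: one capital letter followed by two lowercase letters.
    if sep and len(prefix) == 3 and prefix[0].isupper() and prefix[1:].islower():
        return prefix, stage
    return None, name

def lineage(name: str):
    """Return the chain from the given term up to the most general term."""
    chain = [name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain
```

For example, `lineage("Ppa_L1")` walks from the species-specific term through L1 up to the larval stage, which is the ordering of proposal 2 read in reverse.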


Adding 3'UTR tag for transgene

  • Capture gene name (not necessarily DNA text)
  • Ignore unc-54 3'UTR
  • Triage by cases that come up for interaction or gene regulation events


Transgene Names

  • WBPaperIDEx vs WBPaperID_Ex (underscore)
  • Would be good to have consistent naming scheme
  • Juancarlos needs to know if any curator wants/needs a history of the name change
    • i.e. keep the original timestamp or create new one
    • Will likely not create new timestamp
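
A consistent naming scheme could be enforced with a small normalizer. The sketch below assumes paper-derived names have the shape `WBPaper` + digits, an optional underscore, then `Ex` and an optional number; that shape, the function name, and the choice of default form are assumptions, since the notes do not say which variant will be kept.

```python
import re

# Assumed name shape: 'WBPaper' + digits, optional underscore, 'Ex', optional digits.
NAME_RE = re.compile(r"^(WBPaper\d+)_?(Ex\d*)$")

def normalize_transgene_name(name: str, use_underscore: bool = False) -> str:
    """Rewrite a paper-derived transgene name into one consistent form."""
    match = NAME_RE.match(name)
    if match is None:
        return name  # not a paper-derived name; leave it untouched
    sep = "_" if use_underscore else ""
    return f"{match.group(1)}{sep}{match.group(2)}"
```

The `use_underscore` flag keeps the open question (with or without underscore) as a single switch rather than hard-coding an answer.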


Life_stage switch from names to IDs

  • Curators need to let Juancarlos know if the OA dumper is working as expected on Mangolassi (sandbox)
  • Once ready, Juancarlos can move the dumper to Tazendra


Interaction model change

  • Will add two tags to the Interactor_info hash:
    • Intragenic_effector_variation
    • Intragenic_affected_variation
  • Will add four tags to the main Interaction model:
    • Unaffiliated_variation
    • Unaffiliated_transgene
    • Unaffiliated_antibody
    • Unaffiliated_expression_pattern
  • These changes will accommodate intragenic genetic interactions and unsuccessful object-to-gene mappings
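
How the Unaffiliated tags absorb failed object-to-gene mappings can be sketched as follows. The tag names are taken from the list above; the dictionary layout, the `Variation` key inside Interactor_info, and the function names are hypothetical stand-ins for the real model and tooling.

```python
from typing import Optional

def new_interaction() -> dict:
    """Create an empty interaction record (assumed layout, not the real model)."""
    return {
        "Interactor_info": {},  # gene name -> {tag: [object names]}
        "Unaffiliated_variation": [],
        "Unaffiliated_transgene": [],
        "Unaffiliated_antibody": [],
        "Unaffiliated_expression_pattern": [],
    }

def add_variation(interaction: dict, variation: str, gene: Optional[str]) -> None:
    """File a variation under its gene if mapped, otherwise as unaffiliated."""
    if gene is None:
        # Mapping to a gene failed: keep the object on the main model.
        interaction["Unaffiliated_variation"].append(variation)
    else:
        info = interaction["Interactor_info"].setdefault(gene, {})
        info.setdefault("Variation", []).append(variation)
```

The same pattern would apply to transgenes, antibodies, and expression patterns; the two intragenic tags are not covered by this sketch.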


May 24, 2012

Rearrangements vs Variations

  • Should deficiencies like mgDf50 be variation or rearrangement objects?
  • Many papers/authors refer to mgDf50 as an allele/variation of daf-16, but the two are not linked in the database
  • Change data model or make 'surrogate' variation/allele for this deficiency?
  • To capture a deficiency as part of a Gene_regulation object, either it has to become a variation or we need to change its data model
  • Easiest thing is to add rearrangements to the Interaction model (and OA)
  • Rearrangements will become a type of interactor; this avoids mapping a rearrangement to many genes and having each of those mapped genes become an interactor in the interaction


Coordinating physical interaction curation with BioGRID

  • Currently, BioGRID curates protein-protein interactions for C. elegans
  • We will now have the Interaction OA and model that can accommodate physical interactions
  • We can continue to curate protein-DNA and protein-RNA interactions, as BioGRID does not capture these
  • Need to decide how we and BioGRID share data
  • Can the BioGRID IMS tool communicate with Postgres/OA to coordinate assignment of WB Interaction IDs?
  • Conclusion: a curator who wants to curate protein-protein interactions will use the BioGRID IMS tool; a curator who wants to curate other physical interactions (e.g. protein-DNA or protein-RNA) will use the Interaction OA


UTR tag for Transgene

  • Proposed tag name: UTR_of_gene
  • Maybe not specific enough
  • Tag name will be 3_UTR for 3' UTRs


Populating the Transgene OA with transgenes

  • Regular expression for capturing transgene names from the corpus
  • Transgenes found are attached to existing transgenes or made as new transgenes in Postgres/OA
  • Documentation needed on Transgene Wiki
  • Put all scripts and script documentation on GitHub and GitWiki?
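
The actual regular expression is not recorded in these notes. A minimal sketch, assuming names follow the usual C. elegans pattern of a short lowercase lab prefix, then `Is` (integrated) or `Ex` (extrachromosomal), then a number, might look like this; the pattern, the function name, and the sample sentence in the usage note are illustrative only.

```python
import re

# Assumed pattern: 1-3 lowercase letters, 'Is' or 'Ex', then digits.
TRANSGENE_RE = re.compile(r"\b[a-z]{1,3}(?:Is|Ex)\d+\b")

def find_transgenes(text: str):
    """Return unique transgene-like names in order of first appearance."""
    seen = []
    for name in TRANSGENE_RE.findall(text):
        if name not in seen:
            seen.append(name)
    return seen
```

Run on a sentence such as "Strains carrying zdIs5 and mgEx12 were crossed", this returns the two names once each; each hit could then be attached to an existing transgene or created as a new one in Postgres/OA.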