Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
Line 1: Line 1:
 +
= Previous Years =
 +
 
[[WormBase-Caltech_Weekly_Calls_2009|2009 Meetings]]
 
[[WormBase-Caltech_Weekly_Calls_2009|2009 Meetings]]
  
 +
[[WormBase-Caltech_Weekly_Calls_2011|2011 Meetings]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_2012|2012 Meetings]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_2013|2013 Meetings]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_2014|2014 Meetings]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_2015|2015 Meetings]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_2016|2016 Meetings]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_2017|2017 Meetings]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_2018|2018 Meetings]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_2019|2019 Meetings]]
 +
 +
 +
 +
 +
= 2020 Meetings =
 +
 +
[[WormBase-Caltech_Weekly_Calls_January_2020|January]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_February_2020|February]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_March_2020|March]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_April_2020|April]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_May_2020|May]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_June_2020|June]]
 +
 +
[[WormBase-Caltech_Weekly_Calls_July_2020|July]]
 +
 +
 +
==August 6th, 2020==
 +
 +
===Experimental conditions data flow into Alliance===
 +
*Experimental conditions in disease annotations: WB has inducers (used to recapitulate the disease condition) and modifiers (a modifier can ameliorate, exacerbate, or have no effect on the disease condition)
 +
*We use the WB Molecule CV for Inducers and Modifiers in disease annotation
 +
*Experimental conditions in phenotype annotations are free text (captured in remarks); we will probably need to formalize them later on
 +
*So for data flow into Alliance:
 +
**In the short term we will load the Molecule CV into the Alliance (Ranjana and Michael P. will work on this)
 +
**Groups will switch to using a common data model that works for all, and a common ontology/ontologies, in the near future.
 +
* How do we handle genetic sex? Part of condition?
 +
** Condition has been intended for external/environmental conditions, whereas genetic sex is inherent to the organism of study
 +
** Expression pattern curation needs genetic sex; needs a model at the Alliance for capturing sex
  
==2011 Meetings==
 
  
[[WormBase-Caltech_Weekly_Calls_February_2011|February]]
+
== August 13, 2020 ==
  
 +
=== Species in Postgres and ACEDB/Datomic ===
 +
* Want to dump "Affected By Pathogen" fields in Phenotype OA and RNAi OA
 +
* Want to be sure that what gets dumped aligns with species loaded into ACEDB
 +
* Currently one species annotated not in WS277: Streptococcus gallolyticus subsp. gallolyticus
 +
* We currently have multiple Postgres tables for storing species lists:
 +
** pap_species_index (used by "Affected By Pathogen" fields, AFP); Kimberly uses it to assign species to papers and occasionally adds new ones
 +
** obo_name_ncbitaxonid
 +
** obo_name_taxon (original, smaller list)
 +
** h_pap_species_index (history for pap_species_index)
 +
* How do species get loaded into ACEDB? Dumps from Postgres? Which table(s)?
 +
* WS277 has 7,906 species (1,936 have no NCBI Taxon ID)
 +
* Kimberly has occasionally uploaded a species.ace file in the context of GO curation, but Hinxton otherwise handles it; we should ask them
 +
* New species are associated with paper objects, but otherwise no additional data for those species come from Caltech
 +
* It might be useful to have species pages in WB that at least list papers for which we have species associations, maybe include other information?
  
== March, 2011 ==
+
=== WS279 Citace upload ===
 +
* When is it happening? Not sure; not on release schedule right now
  
== March 3, 2011 ==
+
=== SOLR server security (IMSS) ===
 +
* IMSS network security blocked network access to our server because of its open SOLR web access.
 +
* The SOLR instance is part of the AMIGO stack (a very old version); it drives our ontology browser directly, and SObA and the Enrichment tools indirectly.
 +
* We added a firewall/URL filter and IMSS reopened network access (for now). IMSS still complains that the service is open to the world.
  
Delayed release cycle:
+
=== Alzheimer's disease portal ===
*Will require more work to prepare for more frequent release of certain data types
+
* Supplement grant awarded to Alliance for an Alzheimer's disease portal
*Aside from Kimberly's data, most data types are not urgent (e.g. Expression pattern)
+
* Could involve automated/concise descriptions, interactions, etc.
*What are the users feeling?
+
* Could establish useful pipelines that could be reused in other contexts
**Having data faster will help users; they don't ask, because they don't see it
 
*On-the-fly updating of website? Like Postgres?
 
*Since we use ACEDB, we have to patch WS with .ACE file, or rebuild whole thing
 
*Flat file Postgres database, replaced every night?
 
*Website calls Postgres directly for certain data types?
 
*Performing build without sequence is easy? Do everything without sequence?
 
*How to integrate sequence data with other data once they're decoupled through the patching process?
 
*We need .ACE patch files
 
*Concise description separate from most else (but connected to papers)
 
*Do papers first?
 
*Website can show anything
 
*If we have a lot of patches, we will not have a check for data inconsistencies/conflicts
 
*Trial patch .ace files for papers first
 
*Juancarlos: Scripts that check differences between data dumps; scripts are data type specific
 
**Curators need to talk to Juancarlos about the importance of different data tags
 
*Paper .ACE file: Would include bibliographic info, journals, authors, genes associated from abstract or added manually
 
*One reason for more frequent releases: because we have first pass author forms; show them we add it quickly
 
**what will be added through the forms: expression patterns? RNAi (difficult?)?
 
*We should check patch before we send to Todd!!! Don't want to crash database
 
*How frequently to patch? Weekly? Daily? Check with Todd, how often he can load them?
 
*Cron job to create patch ACE files, send them to curators to check for problems, then send to Todd
 
*Interdependency of data types; curators rely on other curators?
 
*Postgres directly to website? Todd would have to work it out
 
*New information flag on website? Toggle visibility?
 
*How do we know that the data do not conflict with each other?
 
*What are common problems? Dumper script goes bad, makes broken lines, empty fields
 
*Error catching mechanisms? More checks on postgres? Dump files?
 
*Data merging problems? What are the cases that are conflicts? Prevent them? Know beforehand?
 
*If we don't know, as long as it doesn't crash the database or fail to load, then OK
 
*Don't do -D stuff, maybe? No deletions? Skip typos?
 
*Always have to check ACE files anyway, but have to do every week (2 weeks?)
 
*We can try a patch every other month
 
*What can we do without the patch?
 
*Did SAB talk about changing to relational databases?
 
**Get website going as is first, and see if it matters?
 
**If people don't want to change data models, we can switch over to relational
 
**Separate panel on website directly from Postgres?
 
*Wen can check the data integration every other month for patch
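The patch-checking concerns raised above (broken lines, empty fields, avoiding -D deletions) could be caught by a small pre-flight script run before a patch goes to Todd. The following is only a minimal sketch, assuming a simplified .ace layout in which objects are separated by blank lines and each object starts with a `Class : "Name"` header; the function name, the sample text, and the WBPaper IDs are hypothetical, and real patch files may need more careful parsing (for example, escaped quotes).

```python
import re

# Simplified header pattern: Class : "Name" (assumption about the .ace layout).
HEADER = re.compile(r'^(\w+)\s*:\s*"?([^"]*)"?\s*$')


def check_ace_patch(text):
    """Return a list of (line_number, message) problems found in patch text."""
    problems = []
    at_object_start = True
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            # A blank line ends the current object; the next line is a header.
            at_object_start = True
            continue
        if line.startswith("-D"):
            problems.append((lineno, "deletion (-D) line"))
        elif at_object_start:
            match = HEADER.match(line)
            if match is None or not match.group(2).strip():
                problems.append((lineno, "object header without a name"))
        elif '""' in line:
            problems.append((lineno, "empty quoted field"))
        at_object_start = False
    return problems


# Hypothetical patch text, for illustration only.
sample = '''Paper : "WBPaper00000001"
Title "An example title"
Journal ""

Paper :
Author "Example A"

Paper : "WBPaper00000002"
-D Title "Old title"
'''

for lineno, message in check_ace_patch(sample):
    print(f"line {lineno}: {message}")
```

A check like this only guards against files that fail to load; it does not detect conflicts between data types, which would still need the data-type-specific comparison scripts mentioned above.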
 

Revision as of 21:01, 13 August 2020

Previous Years

2009 Meetings

2011 Meetings

2012 Meetings

2013 Meetings

2014 Meetings

2015 Meetings

2016 Meetings

2017 Meetings

2018 Meetings

2019 Meetings



2020 Meetings

January

February

March

April

May

June

July


August 6th, 2020

Experimental conditions data flow into Alliance

  • Experimental conditions in disease annotations: WB has inducers (used to recapitulate the disease condition) and modifiers (a modifier can ameliorate, exacerbate, or have no effect on the disease condition)
  • We use the WB Molecule CV for Inducers and Modifiers in disease annotation
  • Experimental conditions in phenotype annotations are free text (captured in remarks); we will probably need to formalize them later on
  • So for data flow into Alliance:
    • In the short term we will load the Molecule CV into the Alliance (Ranjana and Michael P. will work on this)
    • Groups will switch to using a common data model that works for all, and a common ontology/ontologies, in the near future.
  • How do we handle genetic sex? Part of condition?
    • Condition has been intended for external/environmental conditions, whereas genetic sex is inherent to the organism of study
    • Expression pattern curation needs genetic sex; needs a model at the Alliance for capturing sex


August 13, 2020

Species in Postgres and ACEDB/Datomic

  • Want to dump "Affected By Pathogen" fields in Phenotype OA and RNAi OA
  • Want to be sure that what gets dumped aligns with species loaded into ACEDB
  • Currently one species annotated not in WS277: Streptococcus gallolyticus subsp. gallolyticus
  • We currently have multiple Postgres tables for storing species lists:
    • pap_species_index (used by "Affected By Pathogen" fields, AFP); Kimberly uses it to assign species to papers and occasionally adds new ones
    • obo_name_ncbitaxonid
    • obo_name_taxon (original, smaller list)
    • h_pap_species_index (history for pap_species_index)
  • How do species get loaded into ACEDB? Dumps from Postgres? Which table(s)?
  • WS277 has 7,906 species (1,936 have no NCBI Taxon ID)
  • Kimberly has occasionally uploaded a species.ace file in the context of GO curation, but Hinxton otherwise handles it; we should ask them
  • New species are associated with paper objects, but otherwise no additional data for those species come from Caltech
  • It might be useful to have species pages in WB that at least list papers for which we have species associations, maybe include other information?
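
The alignment check described above (making sure every species dumped from the "Affected By Pathogen" fields also exists in the release) amounts to a set difference. The sketch below is illustrative only: the function names are hypothetical, the input lists are hard-coded stand-ins for a Postgres query (for example against pap_species_index) and a species dump from the release, and "Pseudomonas aeruginosa" is an invented example entry rather than something taken from the notes.

```python
def normalize(name: str) -> str:
    """Collapse whitespace and case so trivially different spellings compare equal."""
    return " ".join(name.split()).casefold()


def species_missing_from_release(annotated, release):
    """Return annotated species names that have no match in the release list."""
    release_keys = {normalize(n) for n in release}
    return sorted({n for n in annotated if normalize(n) not in release_keys})


# Hypothetical inputs: in practice these would come from Postgres and from
# a species dump of the release being checked.
annotated_species = [
    "Caenorhabditis elegans",
    "Streptococcus gallolyticus subsp. gallolyticus",
    "Pseudomonas  aeruginosa",  # extra space on purpose; normalization handles it
]
release_species = ["Caenorhabditis elegans", "Pseudomonas aeruginosa"]

missing = species_missing_from_release(annotated_species, release_species)
print(missing)
```

Matching on normalized names is a deliberate simplification; since 1,936 species in WS277 have no NCBI Taxon ID, matching on taxon ID alone would not cover every case.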

WS279 Citace upload

  • When is it happening? Not sure; not on release schedule right now

SOLR server security (IMSS)

  • IMSS network security blocked network access to our server because of its open SOLR web access.
  • The SOLR instance is part of the AMIGO stack (a very old version); it drives our ontology browser directly, and SObA and the Enrichment tools indirectly.
  • We added a firewall/URL filter and IMSS reopened network access (for now). IMSS still complains that the service is open to the world.

Alzheimer's disease portal

  • Supplement grant awarded to Alliance for an Alzheimer's disease portal
  • Could involve automated/concise descriptions, interactions, etc.
  • Could establish useful pipelines that could be reused in other contexts