Difference between revisions of "WormBase-Caltech Weekly Calls"

From WormBaseWiki
Revision as of 18:28, 3 March 2011

2009 Meetings


2011 Meetings

February


March, 2011

March 3, 2011

Delayed release cycle:

  • Will require more work to prepare for more frequent release of certain data types
  • Aside from Kimberly's data, most data types are not urgent (e.g. Expression pattern)
  • What are the users feeling?
    • Having data faster will help users; they don't ask, because they don't see it
  • On-the-fly updating of website? Like Postgres?
  • Since we use ACEDB, we have to patch WS with a .ACE file or rebuild the whole thing
  • Flat file Postgres database, replaced every night?
  • Website calls Postgres directly for certain data types?
  • Performing build without sequence is easy? Do everything without sequence?
  • How to integrate sequence data with other data once they're decoupled through the patching process?
  • We need .ACE patch files
  • Concise description separate from most else (but connected to papers)
  • Do papers first?
  • Website can show anything
  • If we have a lot of patches, we will not have a check for data inconsistencies or conflicts
  • Trial patch .ace files for papers first
  • Juancarlos: Scripts that check differences between data dumps; scripts are data type specific
    • Curators need to talk to Juancarlos about the importance of different data tags
  • Paper .ACE file: Would include bibliographic info, journals, authors, genes associated from abstract or added manually
  • One reason for more frequent releases: we have first-pass author forms; show authors that we add their data quickly
    • What will be added through the forms: expression patterns? RNAi (difficult?)?
  • We should check patch before we send to Todd!!! Don't want to crash database
  • How frequently to patch? Weekly? Daily? Check with Todd how often he can load them
  • Cron job to create patch ACE files, send to curators to check for problems, then send to Todd
  • Interdependency of data types; curators rely on other curators?
  • Postgres directly to website? Todd would have to work it out
  • New information flag on website? Toggle visibility?
  • How do we know that the data do not conflict with each other?
  • What are common problems? Dumper script goes bad, makes broken lines, empty fields
  • Error catching mechanisms? More checks on postgres? Dump files?
  • Data merging problems? Which cases are conflicts? Prevent them? Know beforehand?
  • If we don't know, as long as it doesn't crash the database or fail to load, then OK
  • Don't do -D stuff, maybe? No deletions? Skip typos?
  • We always have to check ACE files anyway, but would have to do it every week (2 weeks?)
  • We can try a patch every other month
  • What can we do without the patch?
  • Did SAB talk about changing to relational databases?
    • Get website going as is first, and see if it matters?
    • If people don't want to change data models, we can switch over to relational
    • Separate panel on website directly from Postgres?
  • Wen can check the data integration every other month for the patch
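
The notes above ask for a check on patch files before they go to Todd, and name broken lines, empty fields, and -D deletions as things to watch for. A minimal sketch of such a pre-send check follows. It is an illustration only, not an existing WormBase script: the function name `check_ace_patch` and the three rules are assumptions about what those problems look like in a .ace file (an odd number of unescaped quotes for a broken line, `""` for an empty field, a leading `-D` for a deletion).

```python
import re


def check_ace_patch(text):
    """Return (line_number, problem) tuples for suspicious lines in a .ace patch.

    Hypothetical helper: the rules below are assumptions, not the checks
    that curators or the dumper scripts actually apply.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines separate objects and // starts a comment; skip both.
        if not stripped or stripped.startswith("//"):
            continue
        # -D marks a deletion; the notes suggest avoiding these in patches.
        if stripped.startswith("-D"):
            problems.append((number, "deletion"))
        # An odd number of unescaped quotes suggests a truncated line.
        if len(re.findall(r'(?<!\\)"', stripped)) % 2 == 1:
            problems.append((number, "unbalanced quote"))
        # Two adjacent unescaped quotes mean a tag was dumped with no value.
        if re.search(r'(?<!\\)""', stripped):
            problems.append((number, "empty field"))
    return problems


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = (
        'Paper : "WBPaper00000001"\n'
        'Title "An example title"\n'
        'Author ""\n'
        '-D Gene "WBGene00000001"\n'
        'Journal "Broken line\n'
    )
    for number, problem in check_ace_patch(sample):
        print(f"line {number}: {problem}")
```

A cron job could run a check like this on each generated patch and send the flagged lines to the responsible curator. It only catches malformed lines; conflicts between data types would still need the data-type-specific dump comparisons mentioned above.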