Difference between revisions of "Design Specs: Database Connectivity"
(26 intermediate revisions by one other user not shown) | |||
Line 5: | Line 5: | ||
Should be fault tolerant if a server or database crashes. | Should be fault tolerant if a server or database crashes. | ||
− | == | + | ==Implementation== |
− | API/Role/Service.pm | + | http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API/Role/Service.pm |
+ | a Moose Role module which defines the common attributes and behavior for different databases. | ||
eg. | eg. | ||
− | ' | + | 'hosts' array reference attribute which has all the available hosts from configuration file. |
− | + | reconnect: abstract method for the database connection, fail-over is also implementation here: | |
− | + | *use Moose 'around' method modifier to check on the db handle every time it is accessed | |
+ | |||
+ | around 'dbh' sub {} | ||
+ | |||
+ | Note that attribute accessors also count as methods for the purposes of satisfying the requirements of a role. | ||
+ | |||
+ | *do "reconnect" if dbh is not set, undef or time-out(AceDB) | ||
+ | * will try to connect for maximum $ReconnectMaxTries(now this is hard coded as 5, will replace by conf) times | ||
+ | * host is picked randomly among the pool | ||
+ | |||
+ | my $host = $self->hosts->[ rand @{$self->hosts} ]; | ||
− | This Role is consumed by each Database Classes(Service module) API/Service/acedb.pm | + | This Role is consumed by each Database Classes(Service module) |
+ | *http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API/Service/acedb.pm | ||
+ | *http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API/Service/gff.pm | ||
− | + | each has its own "connect" method and "ping" method(mostly just a wrapper) | |
<pre> | <pre> | ||
AceDB | AceDB | ||
use Ace; | use Ace; | ||
− | Ace->connect | + | sub connect { |
+ | my $self = shift; | ||
+ | my ($host,$port,$user,$pass)=@_; | ||
+ | return Ace->connect(-host => $host, | ||
+ | -port => $self->port, | ||
+ | -user=>$self->user, | ||
+ | -pass=>$self->pass, | ||
+ | ); | ||
+ | # @cache); | ||
+ | } | ||
GFF database | GFF database | ||
use Bio::DB::GFF; | use Bio::DB::GFF; | ||
− | Bio::DB::GFF->new( | + | sub connect { |
+ | my $self = shift; | ||
+ | my ($host,$port,$user,$pass)=@_; | ||
+ | return Bio::DB::GFF->new( -user => $user, | ||
+ | -pass => $pass, | ||
+ | -dsn => "dbi:mysql:database=".$self->species.";host=" . $host, | ||
+ | ); | ||
+ | } | ||
</pre> | </pre> | ||
+ | |||
+ | Minor changes in main API http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API.pm | ||
+ | * took out the BUILD subroutine | ||
+ | * modified gff_dsn() method | ||
+ | |||
+ | ==Testing== | ||
+ | |||
+ | prove -lv t/WormBase/API/Object/Gene.t http://bitbucket.org/tharris/wormbase/src/tip/t/WormBase/API/Object/Gene.t | ||
+ | |||
+ | <pre> | ||
+ | ok 1 - use WormBase::API; | ||
+ | ok 2 - Constructed WormBase::API object ok | ||
+ | [DEBUG] acedb dbh for specie doesn't exist or is not alive; trying to connect | ||
+ | [INFO] trytime 1: Connecting to acedb | ||
+ | [DEBUG] using the following parameters: | ||
+ | [DEBUG] aceserver.cshl.edu:2005 | ||
+ | [INFO] --> succesfully established connection to acedb on aceserver.cshl.edu | ||
+ | [DEBUG] acedb dbh for specie exists and is alive! | ||
+ | ok 3 - The object isa WormBase::API::Object::Gene | ||
+ | ... | ||
+ | ... | ||
+ | [DEBUG] gff dbh for specie c_elegans doesn't exist or is not alive; trying to connect | ||
+ | [INFO] trytime 1: Connecting to gff | ||
+ | [DEBUG] using the following parameters: | ||
+ | Use of uninitialized value in concatenation (.) or string at /home/xshi/work/project/wormbase/website-2.0/lib/WormBase/API/Role/Service.pm line 106. | ||
+ | [DEBUG] aceserver.cshl.edu: | ||
+ | [INFO] --> succesfully established connection to gff on aceserver.cshl.edu | ||
+ | [DEBUG] gff dbh for specie c_elegans exists and is alive! | ||
+ | [DEBUG] gff dbh for specie c_elegans exists and is alive! | ||
+ | ... | ||
+ | </pre> | ||
+ | |||
+ | ==Question== | ||
+ | When Ace time out would the dbh becomes undef? | ||
+ | How useful is the Ace reopen method? For now I just re-connect. | ||
+ | |||
+ | ==Continued Work== | ||
+ | |||
+ | setting up attributes such as hosts, user, password, caching for AceDB etc through Configuration file. | ||
+ | |||
+ | Catalyst::Plugin::ConfigLoader | ||
+ | |||
+ | |||
+ | use Ace (); to prevent eg. the connect sub from Ace being imported into our namespaces. | ||
==Things to think about== | ==Things to think about== | ||
Line 59: | Line 132: | ||
Currently, each database is a "Service". When the API is instantiated, connections are made to each required service. Database handles are cached. When a database handle is needed, the service first checks to see if it is live, if not it refreshes the connection. This would probably be the ideal location to implement failover -- check the current connection, if dead, try a different database server. | Currently, each database is a "Service". When the API is instantiated, connections are made to each required service. Database handles are cached. When a database handle is needed, the service first checks to see if it is live, if not it refreshes the connection. This would probably be the ideal location to implement failover -- check the current connection, if dead, try a different database server. | ||
+ | *Note that attribute accessors also count as methods for the purposes of satisfying the requirements of a role. | ||
http://bitbucket.org/tharris/wormbase/src/9009075c4c5a/lib/WormBase/API/Role/Service/gff.pm | http://bitbucket.org/tharris/wormbase/src/9009075c4c5a/lib/WormBase/API/Role/Service/gff.pm | ||
Line 92: | Line 166: | ||
This nomenclature and structure is a bit confusing and probably needs to be streamlined/refined!! | This nomenclature and structure is a bit confusing and probably needs to be streamlined/refined!! | ||
+ | |||
+ | |||
+ | |||
+ | [[Category:Developer documentation]] |
Latest revision as of 18:48, 11 August 2010
Contents
Objectives
Part of the WormBase::API. Needs to be able to connect to multiple databases (multiple instances of the same database in some cases). Should be fault tolerant if a server or database crashes.
Implementation
http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API/Role/Service.pm a Moose Role module which defines the common attributes and behavior for different databases.
eg.
'hosts' array reference attribute which has all the available hosts from configuration file.
reconnect: abstract method for the database connection, fail-over is also implementation here:
- use Moose 'around' method modifier to check on the db handle every time it is accessed
around 'dbh' sub {}
Note that attribute accessors also count as methods for the purposes of satisfying the requirements of a role.
- do "reconnect" if dbh is not set, undef or time-out(AceDB)
- will try to connect for maximum $ReconnectMaxTries(now this is hard coded as 5, will replace by conf) times
- host is picked randomly among the pool
my $host = $self->hosts->[ rand @{$self->hosts} ];
This Role is consumed by each Database Classes(Service module)
- http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API/Service/acedb.pm
- http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API/Service/gff.pm
each has its own "connect" method and "ping" method(mostly just a wrapper)
AceDB use Ace; sub connect { my $self = shift; my ($host,$port,$user,$pass)=@_; return Ace->connect(-host => $host, -port => $self->port, -user=>$self->user, -pass=>$self->pass, ); # @cache); } GFF database use Bio::DB::GFF; sub connect { my $self = shift; my ($host,$port,$user,$pass)=@_; return Bio::DB::GFF->new( -user => $user, -pass => $pass, -dsn => "dbi:mysql:database=".$self->species.";host=" . $host, ); }
Minor changes in main API http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API.pm
- took out the BUILD subroutine
- modified gff_dsn() method
Testing
prove -lv t/WormBase/API/Object/Gene.t http://bitbucket.org/tharris/wormbase/src/tip/t/WormBase/API/Object/Gene.t
ok 1 - use WormBase::API; ok 2 - Constructed WormBase::API object ok [DEBUG] acedb dbh for specie doesn't exist or is not alive; trying to connect [INFO] trytime 1: Connecting to acedb [DEBUG] using the following parameters: [DEBUG] aceserver.cshl.edu:2005 [INFO] --> succesfully established connection to acedb on aceserver.cshl.edu [DEBUG] acedb dbh for specie exists and is alive! ok 3 - The object isa WormBase::API::Object::Gene ... ... [DEBUG] gff dbh for specie c_elegans doesn't exist or is not alive; trying to connect [INFO] trytime 1: Connecting to gff [DEBUG] using the following parameters: Use of uninitialized value in concatenation (.) or string at /home/xshi/work/project/wormbase/website-2.0/lib/WormBase/API/Role/Service.pm line 106. [DEBUG] aceserver.cshl.edu: [INFO] --> succesfully established connection to gff on aceserver.cshl.edu [DEBUG] gff dbh for specie c_elegans exists and is alive! [DEBUG] gff dbh for specie c_elegans exists and is alive! ...
Question
When Ace time out would the dbh becomes undef? How useful is the Ace reopen method? For now I just re-connect.
Continued Work
setting up attributes such as hosts, user, password, caching for AceDB etc through Configuration file.
Catalyst::Plugin::ConfigLoader
use Ace (); to prevent eg. the connect sub from Ace being imported into our namespaces.
Things to think about
persistent connections
Apache::DBI DBI connect_cached DBIx::Class
Also, because Apache::DBI's connection caching never checks to see if the process and/or thread id changed, it will give you an invalid $dbh from a previous process/thread if you use a connection, then thread/fork, then try to re-connect to the same source, which is a Bad Thing.
DBIx::Class already manages its connections for you, and therefore it cannot benefit from Apache::DBI under any scenario. It makes one connection per-process, and keeps that connection persistent, reconnecting only if the connection appears to have died, or if you fork/thread over to another process/thread-id. The only Apache::DBI issue in DBIx::Class is that Apache::DBI will actually thwart DBIx::Class's connection management code, and cause it to use the same (and invalid) connection in a new process, in cases such as (as mentioned above) if you make a DBI connection before forking in a prefork mod_perl server.
Resources
The current connectivity code is part of the API. The API, in turn, is inside of the lib directory of the WormBase mercurial repository:
http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API/Service/
Currently, each database is a "Service". When the API is instantiated, connections are made to each required service. Database handles are cached. When a database handle is needed, the service first checks to see if it is live, if not it refreshes the connection. This would probably be the ideal location to implement failover -- check the current connection, if dead, try a different database server.
- Note that attribute accessors also count as methods for the purposes of satisfying the requirements of a role.
http://bitbucket.org/tharris/wormbase/src/9009075c4c5a/lib/WormBase/API/Role/Service/gff.pm
around 'dbh' => sub { my $orig = shift; my $self = shift; my $species = $self->species; # Do we already have a dbh? HOW TO TEST THIS WITH HASH REF? if ($self->has_dbh) { $self->log->debug(" gff-dbh for $species exists and is alive!"); return $self->$orig; } else { $self->log->debug(" gff-dbh for $species doesn't exist yet; trying to connect"); my $dbh = $self->connect($species); } };
Note also that these Services use Moose "Roles" to define shared methods and variables like connect().
Here's the acedb service:
http://bitbucket.org/tharris/wormbase/src/tip/lib/WormBase/API/Service/acedb.pm
The acedb service "consumes" the generic Service role:
http://bitbucket.org/tharris/wormbase/src/9009075c4c5a/lib/WormBase/API/Role/Service.pm
This nomenclature and structure is a bit confusing and probably needs to be streamlined/refined!!