Load-balancing and automatic failover mechanisms are essential components of a high-availability system. Equally important is efficient detection of failing services and servers that may not fall under the auspices of the failover protocol. This may include monitoring the size of log files, the amount of disk space available, memory race conditions, and the availability of secondary services.
We use monit at WormBase to handle these tasks and more.
Monit is simple to install.
todd> cd ~/build
todd> tar xzf ../src/monit*.tar.gz
todd> cd monit*
todd> ./configure
todd> make
todd> sudo make install
monit uses a fun free-text configuration file format. Multiple instances of monit can be launched, each pointing to its own configuration file. By fun, I mean that it is much more fun than writing init scripts.
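To give a feel for the free-text format, here is a minimal sketch of a control file. The poll interval, the check name rootfs, and the 90% threshold are illustrative assumptions, not WormBase's actual settings.

```
# Minimal monitrc sketch; the interval, check name, and threshold are assumptions.

# Poll all checks every 120 seconds.
set daemon 120

# Raise an alert when the root filesystem is more than 90% full.
check filesystem rootfs with path /
    if space usage > 90% then alert
```

Each check block names a resource and lists plain-language rules for it, which is what makes the format easier to maintain than a hand-written init script.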
monit configuration files live at:
Test the monit configuration file
todd> monit -t -c /path/to/monitrc
Starting and stopping monit
Start monit by:
todd> monit -c /path/to/monitrc
Stop monit by:
todd> monit -c /path/to/monitrc quit
Configuring the system to run monit under init
We configure monit to run under init. To do this, edit /etc/inittab and add an entry for each of the WormBase monitrc files:
todd> sudo nano /etc/inittab
Add a line like the following, for example:
mo:2345:respawn:/usr/local/bin/monit -Ic /home/todd/monitrc
Force init to re-read its configuration file:
todd> sudo /sbin/telinit q
What WormBase monitors
Nothing can bring down a server faster than logs that eat up all available disk space or that grow to behemoth proportions. This is particularly true of the sgifaceserver log serverlog.wrm, a file that grows so fast it makes my head spin. If this file hits 2 GB in size, sgifaceserver will crash.
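A file-size check along the following lines could catch the log before it reaches the 2 GB limit. This is only a sketch: the log path, the 1 GB threshold, and the rotate_serverlog.sh script are hypothetical placeholders, not the actual WormBase configuration.

```
# Hypothetical path, threshold, and script; substitute the real locations.
check file serverlog with path /path/to/serverlog.wrm
    # Act well before the 2 GB size at which sgifaceserver crashes.
    if size > 1 GB then exec "/path/to/rotate_serverlog.sh"
```

Setting the threshold well below the crash size leaves time for the rotation script to run between polling cycles.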