GBrowse Administration
Contents
- 1 Overview
- 2 Hardware Topology
- 3 Basic Server Configuration
- 3.1 Create groups
- 3.2 Create directories
- 3.3 Install prerequisites (Debian)
- 3.4 Install Perl 5.10.1 or greater
- 3.5 Install Apache2
- 3.6 Use mod_fcgid
- 3.7 Install Fast CGI
- 3.8 MySQL
- 3.9 Install Requisite Perl Modules
- 3.10 Install SamTools
- 3.11 Install Bio-BigFile
- 3.12 Install GBrowse2
- 3.13 Fetch configuration and support files
- 3.14 Configure the master server
- 3.15 GBrowse slave servers
- 4 Administration
- 5 Installing GBrowse 1.x
- 6 References
- 7 Author
Overview
GBrowse is one of the central -- and most heavily used -- components of the WormBase site. To meet this demand, we use a number of dedicated servers using GBrowse's built in renderfarm capability. This document describes how to build and configure these servers.
Hardware Topology
The GBrowse topology looks like this:
                Master
             -------------
      -------|    gb1    |-------
      |      |* * * * * *|      |
-------------             -------------
|    gb2    |             |    gb3    |   Slaves
|* * * * * *|             |* * * * * *|
-------------             -------------

* * * * * = relational databases running on each node.
Note that although this topology presents a single master with multiple slave servers, in reality every node is redundant with the others. This makes failover easier if the master server dies. See the Administration section for details.
Basic Server Configuration
Create groups
sudo addgroup wormbase
sudo addgroup gbrowse
# Add yourself to the groups (no space after the comma in the group list)
sudo usermod -a -G gbrowse,wormbase tharris
Create directories
Create a directory for GBrowse. For now, we maintain GBrowse on dedicated hardware completely distinct from the rest of the web application. This makes updates significantly easier, although it increases maintenance (i.e., boilerplate headers and footers are maintained in two places). We also keep gbrowse inside a wormbase/ directory in case we wish to install other components on these servers at a later date.
$ sudo mkdir -p /usr/local/wormbase
$ sudo chgrp -R wormbase /usr/local/wormbase
$ sudo chmod 2775 /usr/local/wormbase

$ cd /usr/local/wormbase
$ mkdir -p logs gbrowse/support-files build extlib
$ chmod 666 logs
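The directory steps above can be wrapped in a small function so the same skeleton is easy to rebuild on each node. This is a minimal sketch: the function name and the optional prefix argument are assumptions added for illustration, and ownership and mode changes (chgrp, chmod) are left to the commands shown above.

```shell
# Create the WormBase directory skeleton under a given prefix.
# make_wormbase_layout and its prefix argument are hypothetical helpers;
# the original steps hard-code /usr/local/wormbase.
make_wormbase_layout() {
    prefix=${1:-/usr/local/wormbase}
    mkdir -p "$prefix" || return 1
    for dir in logs gbrowse/support-files build extlib; do
        mkdir -p "$prefix/$dir" || return 1
    done
}
```

Calling `make_wormbase_layout /tmp/wormbase-test` builds the tree in a scratch location, which is a cheap way to check the layout before touching /usr/local.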
Install prerequisites (Debian)
# Update apt
$ sudo apt-get update

# Make tools
$ sudo apt-get install gcc make

# expat
$ sudo apt-get install expat libexpat1 libexpat1-dev

# GD support
$ sudo apt-get install libgd2-noxpm libgd2-noxpm-dev

# Graphviz, now apparently required by BioPerl
$ sudo apt-get install graphviz
Install Perl 5.10.1 or greater
You'll need Perl version 5.10.1 or greater.
Debian:
$ sudo apt-get install perl
NOTE: This version of Perl on Debian Lenny doesn't ship with perl.h. You won't be able to build mod_perl without it! You should probably build from source.
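Before deciding whether to build from source, it can help to check the installed Perl against the 5.10.1 minimum. The following is a sketch only: `version_ge` is a hypothetical helper that compares dotted version strings field by field, and it assumes purely numeric fields.

```shell
# Return success if dotted version $1 is greater than or equal to $2.
# version_ge is an illustrative helper, not part of Perl or GBrowse.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}
        y=${b%%.*}
        # Missing fields count as 0, so 5.12 compares as 5.12.0
        [ "${x:-0}" -gt "${y:-0}" ] && return 0
        [ "${x:-0}" -lt "${y:-0}" ] && return 1
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Example use with the installed perl (not run automatically):
#   version_ge "$(perl -e 'printf "%vd", $^V')" 5.10.1 || echo "Perl is too old"
```

The check only covers the version number. Whether perl.h is present still has to be verified separately on Debian Lenny.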
Building from source:
bash> cd ~/src
bash> curl -O http://www.cpan.org/src/perl-5.10.1.tar.gz
bash> cd ~/build
bash> tar xzf ../src/perl-5.10.1.tar.gz
bash> cd perl-5.10.1
bash> ./Configure -des
# Or, to install in a local path:
bash> ./Configure -des -Dprefix=$HOME/website/perl/5.10.1
bash> make
bash> make test
bash> sudo make install
Install Apache2
Technically, none of the slave renderers requires Apache. We install it anyway in case the master server dies.
If you're running Debian, get rid of the installed apache. The default layout is just plain annoying.
$ sudo apt-get remove apache2
Assuming you have already fetched the source into ~/src:
# Build httpd 2.2.11
cd ~/src
tar xzf httpd-2.2.11.tar.gz
cd httpd-2.2.11
./configure --enable-mods-shared=all --enable-proxy
make
sudo make install
Apache Configuration
Edit the primary httpd.conf file (/usr/local/apache2/conf/httpd.conf) with the appropriate port, virtual host, and fcgi settings:
- Set httpd to listen on your desired port. In the examples below, we assume port 8080.
# Edit "Listen 80" to read
Listen 8080
Enable virtual hosts by uncommenting the following line:
#Include conf/extra/httpd-vhosts.conf
Enable FastCGI module (installed later)
LoadModule fastcgi_module modules/mod_fastcgi.so
Add FastCGI config from GBrowse (see the GBrowse section below for details). Note that we tweak the Location Aliases to place the fast CGI-enabled GBrowse under /db/gb2.
<IfModule mod_fcgid.c>
    Alias /db/gb2 "/usr/local/wormbase/gbrowse2/current/cgi/gb2"
    <Location /db/gb2>
        SetHandler fcgid-script
        Options ExecCGI
    </Location>
    DefaultInitEnv GBROWSE_CONF /usr/local/wormbase/gbrowse2/current/conf
    DefaultInitEnv PERL5LIB /usr/local/wormbase/extlib/lib/perl5/x86_64-linux-gnu-thread-multi:/usr/local/wormbase/extlib/lib/perl5:/usr/local/wormbase/gbrowse2/current/lib/perl5:/usr/local/wormbase/gbrowse2/support-files/lib:/usr/local/wormbase/gbrowse2/support-files/lib/Bio:/usr/local/wormbase/gbrowse2/support-files/lib/Bio/Graphics:/usr/local/wormbase/gbrowse2/support-files/lib/Bio/Graphics/Glyph
</IfModule>

<IfModule mod_fastcgi.c>
    Alias /db/gb2 "/usr/local/wormbase/gbrowse2/current/cgi/gb2"
    <Location /db/gb2>
        SetHandler fastcgi-script
        Options ExecCGI
    </Location>
    FastCgiConfig -initial-env PERL5LIB=/usr/local/wormbase/extlib/lib/perl5/x86_64-linux-gnu-thread-multi:/usr/local/wormbase/extlib/lib/perl5:/usr/local/wormbase/gbrowse2/current/lib/perl5:/usr/local/wormbase/gbrowse2/support-files/lib:/usr/local/wormbase/gbrowse2/support-files/lib/Bio:/usr/local/wormbase/gbrowse2/support-files/lib/Bio/Graphics:/usr/local/wormbase/gbrowse2/support-files/lib/Bio/Graphics/Glyph
</IfModule>
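Because the PERL5LIB value is long, a mistyped entry is easy to miss and only shows up later as a missing-module error. A small check like the following can list entries that do not exist on disk. It is a sketch: `missing_path_entries` is a hypothetical helper, and it assumes the paths contain no shell glob characters.

```shell
# Print each entry of a colon-separated path list that is not an existing
# directory; return non-zero if any entry is missing.
# missing_path_entries is an illustrative helper, not part of GBrowse.
missing_path_entries() {
    status=0
    old_ifs=$IFS
    IFS=:
    for entry in $1; do
        if [ -n "$entry" ] && [ ! -d "$entry" ]; then
            printf '%s\n' "$entry"
            status=1
        fi
    done
    IFS=$old_ifs
    return "$status"
}
```

Pasting the PERL5LIB value from the Apache configuration as the single argument prints any directory that is absent on that node.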
Set up a virtual host on your port by editing /usr/local/apache2/conf/extra/httpd-vhosts.conf:
<VirtualHost *:8080>
    ErrorLog "/usr/local/wormbase/logs/error_log"
    CustomLog "/usr/local/wormbase/logs/access_log" common

    # Everything for GBrowse2 will be served from under a url of either
    #   /db/gb2 or /gbrowse2
    # This makes proxying from CSHL more convenient.
    # Note that /db is historical but required for continuity.
    Alias "/gbrowse2/i/" "/usr/local/wormbase/gbrowse2/tmp/images/"

    # NOTE: this MUST be different from the ScriptAlias.
    Alias "/gbrowse2" "/usr/local/wormbase/gbrowse2/current/html"
    ScriptAlias "/db/gb2" "/usr/local/wormbase/gbrowse2/current/cgi/gb2"

    # Support files: the wormbase css file for gbrowse, banner image, etc
    # Map a URL to a directory. Here we use gb2-support to distinguish it from
    # the gb1 support files directory.
    Alias "/gb2-support/" "/usr/local/wormbase/gbrowse2/current/support-files/"

    <Directory "/usr/local/wormbase/gbrowse2/support-files">
        Options -Indexes -MultiViews +FollowSymLinks
        Order allow,deny
        Allow from all
    </Directory>

    # The temporary GBrowse directory
    <Directory "/usr/local/wormbase/gbrowse2/tmp">
        Allow from all
    </Directory>

    <Directory "/usr/local/wormbase/gbrowse2/current/html">
        Options -Indexes -MultiViews +FollowSymLinks
        Order allow,deny
        Allow from all
    </Directory>

    <Directory "/usr/local/wormbase/gbrowse2/current/cgi/gb2">
        SetEnv PERL5LIB "/usr/local/wormbase/extlib/lib/perl5/x86_64-linux-gnu-thread-multi:/usr/local/wormbase/extlib/lib/perl5:/usr/local/wormbase/gbrowse2/current/lib/perl5"
        SetEnv GBROWSE_CONF "/usr/local/wormbase/gbrowse2/current/conf"
    </Directory>
</VirtualHost>
Set up httpd to run under init.d
Only required for the master server.
Remove /etc/init.d/apache2. It just confuses things.
Set which runlevels httpd will run under, either with chkconfig or by hand on the command line:
chkconfig --add httpd
chkconfig --level 2345 httpd on
chkconfig --list

sudo cp /usr/local/apache2/bin/apachectl /etc/init.d/httpd
cd /etc/rc3.d
sudo ln -s ../init.d/httpd S90httpd
cd /etc/rc5.d
sudo ln -s ../init.d/httpd S90httpd
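The manual runlevel symlinks above follow a pattern that recurs for the other services on these nodes, so it may be convenient to express it once. This is a sketch under assumptions: `link_runlevels` is a hypothetical helper, the optional third argument (an alternate /etc root) exists only so the function can be tried in a scratch directory, and it must run as root to write to the real /etc.

```shell
# Link an init.d script into runlevels 3 and 5 under the given start name.
# link_runlevels is an illustrative helper; it mirrors the ln -s commands above.
link_runlevels() {
    script=$1
    name=$2
    etc_root=${3:-/etc}
    for level in 3 5; do
        # -f replaces an existing link; -n avoids following a link to a directory
        ln -sfn "../init.d/$script" "$etc_root/rc$level.d/$name" || return 1
    done
}
```

For httpd this would be `link_runlevels httpd S90httpd`.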
Use mod_fcgid
mod_fcgid has a performance advantage over mod_fastcgi (especially after bringing in the modENCODE data). See Lincoln's notes:
http://gmod.org/wiki/Recompiling_mod_fcgid_to_avoid_truncated_Perl_library_paths
Install Fast CGI
$ cd /usr/local/wormbase/build
$ wget http://www.fastcgi.com/dist/mod_fastcgi-2.4.6.tar.gz
$ tar xzf mod_fastcgi-2.4.6.tar.gz
$ cd mod_fastcgi-2.4.6
$ cp Makefile.AP2 Makefile
$ make
$ sudo make install
MySQL
Each node in the renderfarm has its own MySQL databases.
Installation (>= 5.1)
We need a newer version of MySQL than is available from the package manager. We'll install the binary for x86_64.
$ sudo groupadd mysql
$ sudo useradd -g mysql mysql
$ sudo usermod -a -G mysql tharris
$ cd /usr/local
$ sudo gunzip < /path/to/mysql-VERSION-OS.tar.gz | sudo tar xvf -
$ sudo ln -s full-path-to-mysql-VERSION-OS mysql
$ cd mysql
$ sudo chown -R mysql .
$ sudo chgrp -R mysql .
# It may be necessary to remove /etc/mysql/my.cnf first
$ sudo scripts/mysql_install_db --user=mysql
$ sudo chown -R root .
$ sudo chown -R mysql data
# I like to be able to write to my datadir
$ sudo chmod 2775 data

# Copy the configuration file
$ sudo cp support-files/my-medium.cnf /etc/my.cnf

# It may be necessary to add the following to my.cnf if back end nodes
# don't have fully qualified domain names
skip-name-resolve

# Set up mysql to start up automatically
$ sudo cp support-files/mysql.server /etc/init.d/.
$ cd /etc/rc3.d
$ sudo ln -s ../init.d/mysql.server S90mysql
$ cd ../rc5.d
$ sudo ln -s ../init.d/mysql.server S90mysql

# Start it up and set a root password.
$ sudo /etc/init.d/mysql.server start
$ mysqladmin -u root password 'PASSWORD'
Databases
MySQL databases will be mirrored over directly as part of the production update process. They should be located in /usr/local/mysql/data. Symlinks, using the symbolic name of the database, point to the current version:
c_elegans -> c_elegans_WS211
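Repointing the symbolic name at a newly mirrored release could be done with a helper along these lines. This is a sketch: `point_database` is a hypothetical name, and it assumes each release lives in a directory named `<species>_<release>` inside the data directory, as in the c_elegans_WS211 example above.

```shell
# Point the symbolic database name at a specific release directory.
# point_database is an illustrative helper, not part of the update process.
point_database() {
    data_dir=$1
    name=$2
    release=$3
    target="${name}_${release}"
    if [ ! -d "$data_dir/$target" ]; then
        echo "no such database directory: $data_dir/$target" >&2
        return 1
    fi
    # -n replaces the existing link itself instead of creating a new link
    # inside the directory it currently points to
    ln -sfn "$target" "$data_dir/$name"
}
```

For example, `point_database /usr/local/mysql/data c_elegans WS211` recreates the link shown above. The function refuses to switch if the release directory has not been mirrored yet.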
Privileges
Be sure to grant privileges to both master and slave servers to the databases:
$ mysql -u root -p -e 'grant select on c_elegans.* to nobody@localhost'
$ mysql -u root -p -e 'grant select on c_elegans.* to nobody@render-slave-ip'
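With several render slaves, the grants above have to be repeated once per host. One way to avoid retyping them is to generate the statements and pipe them to mysql in a single session. This is a sketch: `grant_select_sql` is a hypothetical helper, and the host names passed to it are placeholders for the real slave addresses.

```shell
# Print one GRANT SELECT statement per host for the given database and user.
# grant_select_sql is an illustrative helper; it only prints SQL and does
# not contact the server.
grant_select_sql() {
    db=$1
    user=$2
    shift 2
    for host in "$@"; do
        printf "GRANT SELECT ON %s.* TO '%s'@'%s';\n" "$db" "$user" "$host"
    done
}
```

Usage would look like `grant_select_sql c_elegans nobody localhost render-slave-ip | mysql -u root -p`, so the root password is entered only once.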
Install Requisite Perl Modules
All libraries specific to GBrowse are maintained in a local shared directory. This makes it easier and faster to upgrade GBrowse, but more difficult to test multiple versions simultaneously. To test a new version of GBrowse, place extlib inside the gbrowse-current directory and modify the Apache configuration and the commands below as appropriate.
Install local::lib to make things easier:
$ sudo perl -MCPAN -e 'CPAN::install(local::lib)'
$ cd /usr/local/wormbase/extlib
To install modules to this path:
$ perl -Mlocal::lib=.
$ eval $(perl -Mlocal::lib=./)
$ sudo rm -rf ~/.cpan
$ perl -MCPAN -e shell
# For either module below, it may be necessary to install from source
cpan> install Bio::Perl
cpan> install Bio::Graphics
And for GBrowse:
cpan> install DBI
cpan> install DBD::mysql
cpan> install CGI::Session
cpan> install JSON
cpan> install Storable
cpan> install FreezeThaw
cpan> install FCGI
cpan> install SVG GD::SVG
cpan> install Text::Shellwords
Optional:
cpan> install Crypt::SSLeay
cpan> install DBD::Pg
cpan> install DB_File::Lock
cpan> install File::NFSLock
cpan> install Net::OpenID::Consumer
cpan> install Net::SMTP::SSL
cpan> install Safe::World
Install SamTools
Download samtools and save it under the build directory, e.g. /usr/local/wormbase/build/samtools-0.1.7a/
$ cd /usr/local/wormbase/build/samtools-0.1.7a/
$ make CXXFLAGS=-fPIC CFLAGS=-fPIC CPPFLAGS=-fPIC
$ sudo cp /usr/local/wormbase/build/samtools-0.1.7a/misc/*.pl /usr/local/bin/
Download Bio-SamTools and save it under the build directory, e.g. /usr/local/wormbase/build/Bio-SamTools-1.16/
$ cd /usr/local/wormbase/build/Bio-SamTools-1.16/
$ export SAMTOOLS=/usr/local/wormbase/build/samtools-0.1.7a/
$ perl Build.PL --install_base=/usr/local/wormbase/extlib/
$ ./Build
$ ./Build test
$ (sudo) ./Build install
Install Bio-BigFile
To compile this module, you must first download, unpack and compile the Jim Kent source tree, located at http://hgdownload.cse.ucsc.edu/admin/jksrc.zip. Please follow the instructions contained in the source tree to create the main library file jkweb.a
$ cd /usr/local/wormbase/build/kent/src/lib
$ export MACHTYPE=x86_64
$ mkdir $MACHTYPE
$ make CXXFLAGS=-fPIC CFLAGS=-fPIC CPPFLAGS=-fPIC
Download Bio-BigFile and save it under the build directory, e.g. /usr/local/wormbase/build/Bio-BigFile-1.02
$ cd /usr/local/wormbase/build/Bio-BigFile-1.02
$ export KENT_SRC=/usr/local/wormbase/build/kent/src/
$ perl Build.PL --install_base=/usr/local/wormbase/extlib/
$ ./Build
$ ./Build test
$ (sudo) ./Build install
Install GBrowse2
# Replace gbrowse-version with the real version, i.e. gbrowse-2.03
$ mkdir /usr/local/wormbase/gbrowse2/gbrowse-version
$ ln -s gbrowse-version current
$ perl ./Build.PL --install_base /usr/local/wormbase/gbrowse2/current
$ perl ./Build --reconfig
Directory for GBrowse's config and support files? /usr/local/wormbase/gbrowse2/support-files/conf
Directory for GBrowse's static images & HTML files? /usr/local/wormbase/gbrowse2/current/html
Directory for GBrowse's temporary data /usr/local/wormbase/gbrowse2/current/tmp
Directory for GBrowse's example databases /usr/local/wormbase/gbrowse2/current/databases
Directory for GBrowse's CGI script executables? /usr/local/wormbase/gbrowse2/current/cgi/gb2
Internet port to run demo web site on (for demo)? [8000]
Apache loadable module directory (for demo)? [/usr/local/apache2/modules]
User account under which Apache daemon runs? [www-data] daemon
Automatically update Apache config files to run GBrowse? [n]
Automatically update system config files to run gbrowse-slave? [n] y
To fetch suggested apache configuration:
./Build apache_conf
To install:
./Build install
You may need to fix permissions:
chmod 777 /usr/local/wormbase/gbrowse2/tmp
Install gbrowse-slave components on all servers:
sudo ./Build install_slave
Edit /etc/default/gbrowse_slave:
# Make sure that PERL5LIB includes the path to our external aggregator modules.
# The continuation lines must start in the first column.
export PERL5LIB=/usr/local/wormbase/extlib/lib/perl5/x86_64-linux-gnu-thread-multi:\
/usr/local/wormbase/extlib/lib/perl5:\
/usr/local/wormbase/gbrowse2/current/lib/perl5:\
/usr/local/wormbase/gbrowse2/current/support-files/lib:\
/usr/local/wormbase/gbrowse2/current/support-files/lib/Bio:\
/usr/local/wormbase/gbrowse2/current/support-files/lib/Bio/Graphics:\
/usr/local/wormbase/gbrowse2/current/support-files/lib/Bio/Graphics/Glyph

DAEMON=/usr/local/bin/gbrowse_slave
# Be certain this matches the httpd daemon user on the master!
USER=daemon
#PRELOAD=/etc/gbrowse2/slave_preload.conf
RUNDIR=/usr/local/wormbase/logs
LOGDIR=/usr/local/wormbase/logs
PORT="8101 8102 8103"
VERBOSITY=3
NICE=0
Then set up the slave server to start automatically:
$ cd /etc/rc3.d
$ sudo ln -s ../init.d/gbrowse-slave S99gbrowse-slave
$ cd /etc/rc5.d
$ sudo ln -s ../init.d/gbrowse-slave S99gbrowse-slave
Fetch configuration and support files
Configuration and all related gbrowse support files are maintained in mercurial.
http://bitbucket.org/tharris/wormbase-gbrowse2
This includes conf, html, libs, css and some images. Apache2 configuration contains directives that map URIs to files in both the conf and support-files directories. In addition, the WormBase instance of GBrowse requires a few custom glyphs that seem to have disappeared in GB2. These are all contained in the support-files/lib directory.
lib/Bio/Graphics/Glyph/full_transcript.pm
lib/Bio/Graphics/Glyph/wormbase_transcript.pm
lib/Bio/DB/GFF/Aggregator/
    blastz_alignment_briggsae.pm
    blastz_alignment_elegans.pm
    waba_alignment_briggsae.pm
    waba_alignment_elegans.pm
    waba_alignment.pm
Fetch the mercurial repository and rename it:
cd /usr/local/wormbase/gbrowse2
hg clone http://bitbucket.org/tharris/wormbase-gbrowse2
mv wormbase-gbrowse2 support-files
Note: Some of the files in the support-files directory may have already been installed by the GBrowse installer. You'll need to either replace them or update them in the repository.
Configure the master server
The configuration file of the master server should be set up with appropriate render slaves in the TRACK DEFAULTS section of each genome-level configuration file.
# wb-www1: XXX.XXX.XXX.173 (master)
# wb-www2: XXX.XXX.XXX.174 (slave)
# wb-www3: XXX.XXX.XXX.175 (slave)

remote renderer = http://XXX.XXX.XXX.173:8101
                  http://XXX.XXX.XXX.173:8102
                  http://XXX.XXX.XXX.173:8103
                  http://XXX.XXX.XXX.174:8101
                  http://XXX.XXX.XXX.174:8102
                  http://XXX.XXX.XXX.174:8103
                  http://XXX.XXX.XXX.175:8101
                  http://XXX.XXX.XXX.175:8102
                  http://XXX.XXX.XXX.175:8103
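The renderer list is simply every host crossed with every slave port, so it can be generated rather than typed. The following is a sketch: `renderer_urls` is a hypothetical helper, and the port list is assumed to match the PORT setting of the slave daemons.

```shell
# Print one renderer URL per host/port combination.
# renderer_urls is an illustrative helper, not part of GBrowse.
# $1 is a space-separated port list; remaining arguments are hosts.
renderer_urls() {
    ports=$1
    shift
    for host in "$@"; do
        for port in $ports; do
            printf 'http://%s:%s\n' "$host" "$port"
        done
    done
}
```

Running `renderer_urls "8101 8102 8103" XXX.XXX.XXX.173 XXX.XXX.XXX.174 XXX.XXX.XXX.175` prints the nine URLs above, ready to paste after `remote renderer =`.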
Define dsn arguments (here we specify that the MySQL databases are running on the localhost).
db_args = -dsn DBI:mysql:c_elegans;user=nobody;host=localhost
# Here's a second example with all databases running on the master server.
db_args = -dsn DBI:mysql:c_elegans;user=nobody;host=XXX.XXX.XXX.173
GBrowse slave servers
To add an additional slave server, follow the steps above, then add the address of the new server to the remote renderer list in the TRACK DEFAULTS section of the genome-level configuration file.
No special configuration is required for GBrowse slave servers. Simply ensure that the slave has access to the MySQL databases using the username and password supplied in the primary configuration file.
Make sure the slave servers are running:
sudo /etc/init.d/gbrowse-slave start
Administration
Updating GBrowse
To update GBrowse:
- Check out the source to /usr/local/wormbase/build
- Follow the directions above to install GBrowse, resetting the current symlink
- Check out the "wormbase-gbrowse2" configuration repository
Maintenance
Purging Temporary Files
GBrowse generates a slew of temporary files. Under production, these temporary files can quickly exhaust the ulimit of a machine. Set up the following cronjob to purge them periodically.
1 1 * * * /usr/local/wormbase/gbrowse-support-files/purge_temporary_files.cron.sh
The contents of this script are:
# Stop if the directory is missing, so find never runs somewhere else
cd /usr/local/wormbase/gbrowse-current/tmp || exit 1
find . -type f -atime +20 -print -exec rm {} \;
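Before scheduling the purge, it can be useful to see what it would delete. The variant below is a sketch, not the script in the repository: `purge_tmp` and its dry-run mode are assumptions added for illustration, while the find test (files not accessed for more than 20 days) is the same as above.

```shell
# Delete, or just list, files under a directory not accessed in N days.
# purge_tmp is an illustrative helper.
# Usage: purge_tmp DIR [DAYS] [dry-run]
purge_tmp() {
    dir=$1
    days=${2:-20}
    mode=${3:-delete}
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ "$mode" = "dry-run" ]; then
        # List candidates without removing anything
        find "$dir" -type f -atime +"$days" -print
    else
        find "$dir" -type f -atime +"$days" -print -exec rm -f {} \;
    fi
}
```

`purge_tmp /usr/local/wormbase/gbrowse-current/tmp 20 dry-run` lists the candidates; dropping the last argument removes them. Passing the directory explicitly also means a missing directory produces an error instead of a purge in the wrong place.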
Log Rotation
gbrowse-support-files/ contains two configuration files for the unix logrotate command. For the master server, copy logrotate-gbrowse-httpd.conf and logrotate-gbrowse-slave.conf to /etc/logrotate.d/gbrowse-httpd and gbrowse-slave. The slave nodes obviously only require the gbrowse-slave file.
The contents of these files are included here for reference:
# logrotate-gbrowse-httpd.conf
# GBrowse httpd logrotate conf file
/usr/local/wormbase/logs/*.log {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    create 640 root adm
    sharedscripts
    postrotate
        /usr/local/apache2/bin/apachectl graceful
    endscript
}

# logrotate-gbrowse-slave.conf
# GBrowse slave files
/usr/local/wormbase/logs/*_slave {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    sharedscripts
    postrotate
        /etc/init.d/gbrowse-slave stop
        /etc/init.d/gbrowse-slave start
    endscript
}
Establishing a new slave node
Here's how to build a new slave node:
1. Set up directories, users, groups
2. Install apache, fastcgi, mysql
3. Copy /usr/local/wormbase from an existing node
Installing GBrowse 1.x
The GBrowse1 installation on our GBrowse nodes mirrors that for gbrowse2.
/usr/local/wormbase/gbrowse1
    current -> gbrowse-1.70/
        extlib/
        html/
            gbrowse/
        conf/
        support-files/
        tmp/
Everything in the current/ directory is maintained in mercurial as wormbase-gbrowse1:
http://bitbucket.org/tharris/wormbase-gbrowse1
Here's how (basically):
cd /usr/local/wormbase/gbrowse1
mkdir gbrowse-1.xx ; cd gbrowse-1.xx
mkdir extlib
mkdir tmp
chown nobody:daemon tmp
perl -Mlocal::lib=./
eval $(perl -Mlocal::lib=./)
perl -MCPAN -e shell
cpan> install Bio::Perl
cpan> install Bio::Graphics
cpan> ... etc, etc
Unpack the gbrowse source and use the following settings (remember: adjust the "current" symlink first).
CGIBIN=/usr/local/wormbase/gbrowse1/current/cgi
DO_XS=1
APACHE=/usr/local/apache2
GBROWSE_ROOT=gbrowse
CONF=/usr/local/wormbase/gbrowse1/current/conf
HTDOCS=/usr/local/wormbase/gbrowse1/current/html
VERSION=1.70
BIOGRAPHICS_VERSION=1.8
Check out the GBrowse1 module from mercurial. This contains all configuration and support files that we might possibly need to alter.
cd /usr/local/wormbase/gbrowse1/current
mkdir temp
mv * temp/
# Check out wormbase-gbrowse1 from bitbucket
# Move files from the checked-out directory to .
References
- The official GBrowse Install HOWTO
- The GBrowse 2.0 renderfarm HowTO at GMOD.org
Author
Todd Harris, PhD (info@toddharris.net)