OICR-Configuring the development machine
Contents
- 1 Address
- 2 Hardware
- 3 Server Configuration
- 3.1 Preparing directories and users
- 3.2 User and group accounts
- 3.3 Directories
- 3.4 Configure the FTP/Mirroring directory THIS IS NOT DONE YET
- 3.5 Perl modules
- 3.6 BIOGRAPHICS NOT DONE
- 3.7 Generic Genome Browser NOT DONE
- 3.8 AceDB
- 3.9 MySQL
- 3.10 Apache2 and mod_perl
- 3.11 Build ePCR NOT DONE
- 3.12 Installing BLAT
- 3.13 Installing BLAST
- 3.14 The WormBase Software
- 3.15 Configuring Servers To Start Automatically
- 3.16 Installing scripts to verify that the servers are running
- 3.17 Testing The Site
- 3.18 Blocking robots
- 4 AUTHOR
Address
wb-dev.oicr.on.ca
Hardware
The WormBase development server at OICR is a virtual server with the following stats:
- Debian Linux
- 500 GB disk space (mounted at /dev/hda1)
- 4 GB RAM
Server Configuration
All WormBase components are collected under a single directory: /usr/local/wormbase
$ ls /usr/local/wormbase
acedb/            // The Acedb database (including bin directory)
util/             // Utility components such as e-pcr and wublast
extlib/           // Third party Perl libraries
website-classic   // The classic WormBase website
website           // The new-and-improved website!
Preparing directories and users
WormBase uses several user accounts for directory and server permissions. You will need to create these users and several preliminary directories. Creating a new user and group varies among Unix flavors. On most Linux systems, the following commands will create the new groups. You should have sudo privilege to execute these commands.
User and group accounts
These users should not have a login password. They are to establish privileges only.
- acedb group
This is the group that will have write privileges to the acedb directory tree. Acedb administrators should be added to this group.
$ /usr/sbin/groupadd acedb
- acedb user
This is the user that the acedb server will run as. It should be a member of the acedb group.
$ /usr/sbin/useradd -g acedb -d /usr/local/wormbase/acedb acedb
This useradd command also adds the new acedb user to the acedb group. Note that the acedb user's home directory is set to /usr/local/wormbase/acedb, a directory which will be created in the next step.
- wormbase group
This is a group that will have write privileges to the wormbase directory tree. WormBase administrators and authors should be added to this group.
$ /usr/sbin/groupadd wormbase
This would be a good time to add yourself to the acedb and wormbase groups.
$ /usr/sbin/usermod -a -G acedb,wormbase [your_login_name]
[The '-a' argument keeps this command from deleting other, preexisting group memberships.]
You may need to re-login for these changes to take effect. Use the groups command to check which groups you are a member of:
% groups
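As a quick check after logging back in, the membership test can be scripted. This is a minimal sketch; `in_group` is a hypothetical helper name, not part of WormBase, and it only inspects the current login.

```shell
#!/bin/sh
# Sketch: report whether the current login belongs to a named group.
# in_group is a hypothetical helper, not part of the WormBase code.
in_group() {
    # id -Gn prints the caller's group names separated by spaces;
    # put one name per line so grep -x matches whole names only.
    id -Gn | tr ' ' '\n' | grep -qx "$1"
}

for g in acedb wormbase; do
    if in_group "$g"; then
        echo "member of $g"
    else
        echo "NOT a member of $g (log out and back in after usermod)"
    fi
done
```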
Directories
The root container for all things WormBase:
- /usr/local/wormbase, owner=root group=wormbase mode=drwxrwsr-x
$ mkdir /usr/local/wormbase
$ chgrp wormbase /usr/local/wormbase
$ chmod 2775 /usr/local/wormbase
- External Perl libraries: /usr/local/wormbase/extlib, owner=tharris group=wormbase mode=drwxrwsr-x
$ mkdir /usr/local/wormbase/extlib
$ chgrp wormbase /usr/local/wormbase/extlib
$ chmod 2775 /usr/local/wormbase/extlib
- The "classic" website: /usr/local/wormbase/website-classic, owner=root group=wormbase mode=drwxrwsr-x
$ mkdir /usr/local/wormbase/website-classic
$ chgrp wormbase /usr/local/wormbase/website-classic
$ chmod 2775 /usr/local/wormbase/website-classic
- /usr/local/wormbase/website-classic/logs, owner=root group=wormbase mode=drwxrwsr-x
$ mkdir /usr/local/wormbase/website-classic/logs
$ chgrp wormbase /usr/local/wormbase/website-classic/logs
$ chmod 2775 /usr/local/wormbase/website-classic/logs
- /usr/local/wormbase/website-classic/cache, owner=nobody group=nobody mode=drwxrwsr-x
$ mkdir /usr/local/wormbase/website-classic/cache
$ chown nobody:nobody /usr/local/wormbase/website-classic/cache
$ chmod 2775 /usr/local/wormbase/website-classic/cache
- The "util" directory contains components that apply to both the classic and updated site, like wublast and e-pcr.
- /usr/local/wormbase/util/wublast, owner=root group=wormbase mode=drwxrwsr-x
$ mkdir -p /usr/local/wormbase/util/wublast
$ chgrp wormbase /usr/local/wormbase/util/wublast
$ chmod 2775 /usr/local/wormbase/util/wublast
- /usr/local/wormbase/acedb, owner=acedb group=acedb mode=drwxrwsr-x
$ mkdir /usr/local/wormbase/acedb
$ chown acedb:acedb /usr/local/wormbase/acedb
$ chmod 2775 /usr/local/wormbase/acedb
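The wormbase-group directories above all follow the same mkdir/chgrp/chmod pattern, so they can be created in a loop. This is a sketch only: `make_shared_dir` and the `WB_ROOT` variable are hypothetical names, and `WB_ROOT` defaults to a scratch directory so the loop can be tried without touching /usr/local.

```shell
#!/bin/sh
# Sketch: create the WormBase directory skeleton under a chosen root.
# WB_ROOT (hypothetical) defaults to a scratch directory for a dry run;
# set WB_ROOT=/usr/local/wormbase and run with sudo for a real install.
WB_ROOT="${WB_ROOT:-$(mktemp -d)}"

# make_shared_dir DIR GROUP: create DIR, hand it to GROUP, set mode 2775.
make_shared_dir() (
    dir="$1"; group="$2"
    mkdir -p "$dir"
    # chgrp fails if the group does not exist yet; report it and carry on.
    chgrp "$group" "$dir" 2>/dev/null || echo "note: could not chgrp $dir to $group" >&2
    chmod 2775 "$dir"    # rwxrwsr-x: group-writable, setgid so new files inherit the group
)

for sub in extlib website-classic website-classic/logs util/wublast; do
    make_shared_dir "$WB_ROOT/$sub" wormbase
done
make_shared_dir "$WB_ROOT/acedb" acedb
echo "created skeleton under $WB_ROOT"
```

The cache directory is left out on purpose, because it is owned by nobody:nobody rather than by the wormbase group. The acedb directory still needs its owner set with chown, as shown above.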
Configure the FTP/Mirroring directory THIS IS NOT DONE YET
- ~ftp/pub/wormbase, owner=root group=wormbase mode=drwxrwsr-x
$ mkdir -p ~ftp/pub/wormbase
$ chgrp wormbase ~ftp/pub/wormbase
$ chmod 2775 ~ftp/pub/wormbase
You may ignore this step if you do not plan to mirror the WormBase FTP site. The -p option creates any missing intermediate parent directories. If your mkdir doesn't support this option, you will need to create the intermediate directories manually.
Perl modules
CPAN / Environment configuration
I maintain a suite of Perl modules common to WormBase at:
/usr/local/wormbase/extlib
If you need to override the default version of a module, place it in the extlib directory of either the classic or the rearchitected site:
/usr/local/wormbase/website-classic/extlib OR /usr/local/wormbase/website/extlib
(Note: It might be necessary to install gcc, curl, wget, unzip, bzip2, etc. via apt-get before beginning.)
Set up CPAN to build modules in the local library path (/usr/local/wormbase/extlib):
perl -MCPAN -e shell    // Note that you DO NOT need to be sudo...
cpan> o conf init       (only necessary if not prompted)
For the Makefile.PL arguments, enter
INSTALL_BASE=/usr/local/wormbase/extlib
And for Build.PL enter
--install_base /usr/local/wormbase/extlib
Prepare/update your CPAN:
cpan> install CPAN
cpan> reload CPAN
Before installing modules, you may need to set your PERL5LIB environment variable to include the extlib directory.
emacs ~/.bash_profile
export PERL5LIB=/usr/local/wormbase/extlib:/usr/local/wormbase/extlib/lib:/usr/local/wormbase/extlib/lib/perl5
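If the profile is sourced more than once, a plain export can pile up duplicate entries. One way to avoid that is sketched below; `prepend_perl5lib` is a hypothetical helper name, and the three paths are the ones listed above.

```shell
#!/bin/sh
# Sketch: prepend a directory to PERL5LIB only if it is not already listed.
# prepend_perl5lib is a hypothetical helper name.
prepend_perl5lib() {
    case ":${PERL5LIB:-}:" in
        *":$1:"*) ;;                                  # already present, do nothing
        *) PERL5LIB="$1${PERL5LIB:+:$PERL5LIB}" ;;    # add to the front
    esac
    export PERL5LIB
}

EXTLIB=/usr/local/wormbase/extlib
prepend_perl5lib "$EXTLIB/lib/perl5"
prepend_perl5lib "$EXTLIB/lib"
prepend_perl5lib "$EXTLIB"
echo "$PERL5LIB"
```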
Basic required modules
Install the following Perl modules via CPAN. Note that you do not need to, and should not, run this with sudo.
perl -I/usr/local/wormbase/extlib/lib -MCPAN -e shell
YAML
LWP
ExtUtils::MakeMaker
Bundle::CPAN
Cache::Cache
Cache::FileCache
CGI
CGI::Session    // CPAN installation fails in local dirs; tries to install man3 in system path.
CGI::Cache
Date::Calc
Date::Manip     // CPAN installation fails in local dirs; tries to install man3 in system path.
--> Build and install Berkeley DB
    wget http://download.oracle.com/berkeley-db/db-4.7.25.tar.gz
    tar xzf db*
    cd db*
    cd build_unix
    ../dist/configure
    make
    sudo make install
DB_File
DBI
DBD::mysql      (mysql must be installed first)
Digest::MD5
GD              // 'sudo apt-get install libgd2-xpm-dev libgd2-xpm' first
GD::SVG
GD::Graph
HTML::TokeParser
IO::Scalar
IO::String
Image::GD::Thumbnail
MIME::Lite
Net::FTP
Proc::Simple
readline
Search::Indexer
SOAP::Lite
Statistics::OLS
Storable
SVG
SVG::Graph
Test::Pod
Text::Shellwords
Time::Format
WeakRef
XML::SAX
XML::Parser
XML::DOM
XML::Writer
XML::Twig
XML::Simple
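With a list this long it helps to see which modules are still missing before and after a CPAN session. The following is a sketch only: `missing_modules` is a hypothetical helper, and it assumes the modules were installed with INSTALL_BASE, which places them under extlib/lib/perl5. It checks for modules and installs nothing.

```shell
#!/bin/sh
# Sketch: print the names of Perl modules that cannot be loaded.
# missing_modules is a hypothetical helper; it only checks, it installs nothing.
missing_modules() {
    for mod in "$@"; do
        # perl -MModule -e1 exits non-zero when the module cannot be loaded
        perl -I/usr/local/wormbase/extlib/lib/perl5 -M"$mod" -e1 2>/dev/null || echo "$mod"
    done
}

# Check a handful of the modules from the list above.
missing_modules YAML LWP DBI GD XML::Simple
```

Any name that is printed still needs to be installed from the cpan> prompt.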
Ace.pm (for the Classic WormBase site)
Ace.pm provides programmatic access to Acedb. You can install it via CPAN:
cpan> install Ace
During configuration, choose option (3), then set the remaining variables as follows:
Site-specific configuration files: /usr/local/wormbase/website-classic/conf
CGI path: /usr/local/wormbase/website-classic/cgi-bin
HTML path: /usr/local/wormbase/website-classic/html
BioPerl NOT DONE
You may install BioPerl either using anonymous CVS or by downloading and installing the most recent stable core.
To install BioPerl from the current stable release (which was once recommended):
% wget http://bioperl.org/DIST/current_core_stable.tar.gz
% gunzip -c cur* | tar xvf -
% cd bioperl-1.4
% perl Build.PL
% ./Build test
% sudo ./Build install
However, in March 2006 this stable release did not work with GBrowse-1.64. So the default choice is currently to install BioPerl from CVS. Installing from CVS will give you the latest version of BioPerl, but may also include unresolved bugs and experimental code.
% cvs -d :pserver:cvs@cvs.open-bio.org:/home/repository/bioperl login
(when prompted for the password, type 'cvs')
% cvs -d:pserver:cvs:cvs@cvs.open-bio.org:/home/repository/bioperl co bioperl-live
% cd bioperl-live
% perl Build.PL
% ./Build test
% sudo ./Build install
(Note that BioPerl once used the old system of "perl Makefile.PL; make; make test; sudo make install", but this has been superseded by the brave new world of Build.pm.)
This will create a directory named bioperl-live. In the future, when you wish to update to the most recent version, simply type "cvs update" in the bioperl-live directory.
Install BioPerl with the Build.PL sequence described above: "perl Build.PL", "./Build test" and "sudo ./Build install".
Finally, go into CPAN again and run:
cpan> install Bio::Das
Installing additional Perl modules
To install Perl modules that are not included on this list (or are new dependencies):
1. Make sure your CPAN is configured to install to /usr/local/wormbase/extlib
You can set this in ~/.cpan/CPAN/MyConfig.pm
'makepl_arg' => q[INSTALL_BASE=/usr/local/wormbase/extlib],
OR
2. If building by hand, you call Makefile.PL as:
perl Makefile.PL INSTALL_BASE=/usr/local/wormbase/extlib
BIOGRAPHICS NOT DONE
Generic Genome Browser NOT DONE
This is a CGI script and some Perl modules that use Bio::DB::GFF and Bio::Graphics to create the main WormBase genome display. It lives at www.gmod.org. Like BioPerl, GBrowse can be installed via anonymous CVS or from the current stable release. As of June 2006, the current stable release (1.64) is the only one that actually works.
Via the latest stable release:
$ wget http://easynews.dl.sourceforge.net/sourceforge/gmod/Generic-Genome-Browser-1.64.tar.gz
$ gunzip -c Gene* | tar xvf -
CVS doesn't work yet (as of June 2006), but:
Via CVS:
$ cvs -d:pserver:anonymous@gmod.cvs.sourceforge.net:/cvsroot/gmod login
(When prompted for a password for anonymous, simply press the Enter key.)
$ cvs -d:pserver:anonymous@gmod.cvs.sourceforge.net:/cvsroot/gmod co Generic-Genome-Browser
Note that: (1) version 1.64 is the latest stable release as of June 2006, but later versions will probably exist; (2) the clumsy "wget-longURL" command is a way of getting around Sourceforge's lack of easy, reliable URLs available to users for source-code packages.
On a Linux distribution using SELinux security (such as Fedora Core 4), compilation will fail until SELinux is in some way circumvented. One way to do this is:
$ setsebool -P httpd_disable_trans 1
$ /usr/local/apache/bin/httpd restart
Then enter the unpacked directory or that fetched by CVS and run the following incantation to install it in the proper place for WormBase:
$ perl Makefile.PL HTDOCS=/usr/local/wormbase/html \
    CGIBIN=/usr/local/wormbase/cgi-perl/seq \
    CONF=/usr/local/wormbase/conf \
    --SELINUX=1    # disables SELinux
$ make
$ make install
The long Makefile.PL incantation can be cut-and-pasted from this document into the command line. The last argument of Makefile.PL is only needed if SELinux exists on the system.
AceDB
Installing from a binary package
$ cd /usr/local/wormbase/acedb/src
$ curl -O ftp://ftp.sanger.ac.uk/pub/acedb/SUPPORTED/ACEDB-binaryLINUX_4.4.9.39.tar.gz
$ curl -O ftp://ftp.sanger.ac.uk/pub/acedb/SUPPORTED/ACEDB-serverLINUX_4.4.9.39.tar.gz
$ cd ../ ; mkdir bin-4.9.39
$ ln -s bin-4.9.39 bin ; cd bin
$ for f in ../src/ACEDB*.tar.gz; do tar xzf "$f"; done    # tar unpacks one archive per invocation
Make sure that these files are executable and owned by root:
$ chown root:root ~acedb/bin/* # This sets both owner and group to 'root'.
Installing from source
$ tar xzf ACEDB-source*    // CAUTION: Tarbomb.
// Install a whole bunch of things: libgtk2.0-0 libgtk2.0-dev libglib, byacc, etc, etc
// Modify the makefile: create a target for server programs (xace tace saceserver sgifaceserver)
// This is all I care about:
SERVERS = xace tace saceserver sgifaceserver saceclient
servers: $(SERVERS)
$ export ACEDB_MACHINE=LINUX_4
$ make servers
$ cp tace xace sgifaceserver saceserver saceclient ~acedb/bin/.
Testing the Installation
At this point, you can test whether the socket server runs correctly. Provided that you have added yourself to the acedb group, you can run the following command:
% ~acedb/bin/sgifaceserver ~acedb/wormbase
// Database directory: /usr/local/wormbase/acedb/wormbase
// Shared files: /usr/local/acedb
// #### Server started at 2001-07-23_16:42:31
// #### host=mondseer.cshl.org listening port=23100
// #### Database dir=/usr/local/acedb/elegans
// #### Working dir=/usr/local/acedb/elegans
// #### clientTimeout=600 serverTimeout=600 maxKbytes=0 autoSaveInterval=600
// Server listening socket 28 created
The line "listening port=23100" indicates that the server is listening on port 23100. Open a new terminal window and use saceclient to confirm that you can communicate with the server:
% ~acedb/bin/saceclient localhost -port 23100
Please enter userid: anonymous
Please enter passwd:
acedb@localhost> find Sequence
// Response: 65 bytes.
// Found 236493 objects in this class
// 236493 Active Objects
acedb@localhost> quit
// Closing connection to server.
// Client sent termination signal by server.
// Response: 13 bytes.
// A bientot
// Please report problems to acedb@sanger.ac.uk
// Bye
Configuring Acedb to start automatically under xinetd
Install xinetd (not standard in Debian!):
$ sudo apt-get install xinetd
Create a configuration file for acedb:
$ sudo emacs /etc/xinetd.d/acedb-wormbase
# file: /etc/xinetd.d/acedb-wormbase
# default: on
# description: wormbase acedb database
service acedb
{
    protocol        = tcp
    socket_type     = stream
    port            = 2005
    flags           = REUSE
    wait            = yes
    user            = acedb
    group           = acedb
    log_on_success  += USERID DURATION
    log_on_failure  += USERID HOST
    server          = /usr/local/wormbase/acedb/bin/sgifaceserver
    server_args     = /usr/local/wormbase/acedb/wormbase 1200:1200:0
}
Edit /etc/services. Although xinetd is not supposed to use /etc/services, the following line must be added:
acedb 2005/tcp
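To avoid adding the same line twice when the setup is repeated, the edit can be made idempotent. This is a sketch under the assumption that you want to try it on a scratch file first; `add_service` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: append a service entry to a services file only if the name is absent.
# add_service is a hypothetical helper; pass /etc/services (as root) for real use.
add_service() (
    file="$1"; name="$2"; portproto="$3"
    # Look for the service name at the start of a line, followed by whitespace.
    if grep -Eq "^${name}[[:space:]]" "$file" 2>/dev/null; then
        echo "$name already present in $file"
    else
        printf '%s\t\t%s\n' "$name" "$portproto" >> "$file"
        echo "added $name $portproto to $file"
    fi
)

# Dry run against a scratch file rather than the live /etc/services.
scratch=$(mktemp)
add_service "$scratch" acedb 2005/tcp
add_service "$scratch" acedb 2005/tcp    # second call changes nothing
cat "$scratch"
```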
Restart xinetd with the following command:
# /etc/init.d/xinetd reload (or restart)
You should now be able to talk to the database using saceclient:
% ~acedb/bin/saceclient localhost -port 2005
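For a non-interactive check that something is answering on the port, a small probe can be scripted. This sketch relies on bash's built-in /dev/tcp redirection, which plain sh does not provide. `port_open` is a hypothetical helper name, and a successful connection only shows that xinetd accepted it, not that the database itself is healthy.

```shell
#!/usr/bin/env bash
# Sketch: test whether a TCP port accepts connections, using bash's
# /dev/tcp redirection (bash only). port_open is a hypothetical helper name.
port_open() {
    # Open the connection in a subshell so the descriptor closes on exit
    # and a failed redirection cannot terminate the calling shell.
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_open localhost 2005; then
    echo "something is accepting connections on port 2005"
else
    echo "nothing is listening on port 2005"
fi
```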
MySQL
Installation
Install mysql and various libraries via apt-get:
$ sudo apt-get install mysql-server-5.0 mysql-server
If the installation fails, disable InnoDB. Edit /etc/mysql/my.cnf and uncomment the following line, then purge and reinstall:
#skip-innodb
$ sudo apt-get purge mysql-server-5.0 mysql-server
$ sudo apt-get install mysql-server-5.0 mysql-server
With this installation, databases are located at /var/lib/mysql. We want to be able to write to this directory from the command line, so:
$ sudo chmod 2775 /var/lib/mysql
Mysqld will automatically be set up to launch at server boot (rc3 and rc5), so there is no need to mess with init scripts.
Set up mysql permissions
# mysql -u root -pPASSWORD
mysql> grant select on elegans.* to nobody@localhost;
Repeat for:
- c_briggsae
- c_japonica
- c_remanei
- c_brenneri
- p_pacificus
- b_malayi
- c_elegans_gmap
- c_elegans_pmap
- autocomplete
- h_bacteriophora
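Rather than typing the grant once per database, the statements can be generated. This is a sketch; `print_grants` is a hypothetical helper that only prints SQL, using the database names listed above and the same grant syntax.

```shell
#!/bin/sh
# Sketch: print one GRANT statement per database named in this section.
# print_grants is a hypothetical helper; it prints SQL and runs nothing.
print_grants() {
    for db in elegans c_briggsae c_japonica c_remanei c_brenneri p_pacificus \
              b_malayi c_elegans_gmap c_elegans_pmap autocomplete h_bacteriophora; do
        echo "grant select on ${db}.* to nobody@localhost;"
    done
}

print_grants
```

After reviewing the output, it can be piped to the client, for example `print_grants | mysql -u root -p`.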
Apache2 and mod_perl
Installation
sudo apt-get install apache2
sudo apt-get install libapache2-mod-perl2
Configuration
We will set up WormBase to be a virtual host running on the default port 80. Add the following configuration directives to /etc/apache2/sites-enabled/000-default:
<VirtualHost *:80>
Include /usr/local/wormbase/website-classic/conf/httpd.conf
# ServerName dev.wormbase.org
# UseCanonicalName on
</VirtualHost>
Be certain to remove or comment out the default configuration.
Modify /usr/local/wormbase/website-classic/conf/httpd.conf to match the new directory layout.
TODO: I NEED TO ACCOUNT FOR OTHER VIRTUAL SERVERS RUNNING ON THE DEVELOPMENT SITE
Build ePCR NOT DONE
- e-PCR (modified version, required for e-PCR search page)
This is located in the directory /usr/local/wormbase/e-PCR, which will come into existence after the WormBase site update program wb_update_wormbase.pl has been successfully run (see below for details). Once the directory has been generated, run:
$ cd /usr/local/wormbase/e-PCR
# Edit 'makefile' to run install rather than ginstall, which doesn't exist on Fedora Linux
$ make
$ make install    # or just run 'install e-PCR /usr/local/bin'
The file /usr/local/wormbase/e-PCR/README-Wormbase describes the changes that were made to the original e-PCR distribution.
Installing BLAT
Jim Kent's BLAT (blast-like alignment tool) is a fast nucleotide aligner used by the blast search page. If you do not plan to support blast searches, you may safely skip this step.
# mkdir -p /usr/local/blat/bin ; cd /usr/local/blat/bin
% wget http://www.soe.ucsc.edu/~kent/exe/linux/blatSuite.33.zip    (for Intel Linux)
% unzip blatSuite.33.zip
% rm blatSuite.33.zip version.doc 11.ooc
Note that this choice gives precompiled binaries for an Intel-based Linux distribution as of March 2006. It would probably be worth checking http://www.soe.ucsc.edu/~kent/exe/linux to see if there is a more up-to-date version than 33. Also, other operating systems will need other binaries. E.g., for Mac OS X, instead run:
% wget http://www.soe.ucsc.edu/~kent/exe/osX/blatSuite.33.zip
For other types of operating systems (e.g., Linux on Opteron-based machines), see http://www.soe.ucsc.edu/~kent/exe/ for the available choices.
The blat server will be started automatically by the update script. For reference, the blat server is launched using the following command.
% /usr/local/blat/bin/gfServer start localhost 2003 \
    /usr/local/wormbase/blat/*.nib > /dev/null 2>&1 &
Installing BLAST
The Blast page requires WU-BLAST. This is a closed-source derivative of NCBI's BLAST. However, WU-BLAST is free to academic users (with licensing) and is thought to have performance advantages over NCBI-BLAST; it can be downloaded from http://blast.wustl.edu/. A typical choice of WU-BLAST for Linux is blast2.linux26-i686.tar.gz.
Conversely, the Blast page can be deactivated if you don't want to provide BLAST searches at your site.
By default, WormBase expects WU-BLAST to be installed in /usr/local/wublast. This is the directory structure used by WormBase:
% ls -l /usr/local/wublast
total 72
lrwxrwxrwx 1 root root    18 May  7 12:26 BLOSUM62 -> matrix/aa/BLOSUM62
-rw-r--r-- 1 root root 46789 Feb  5  1998 HISTORY
-rw-r--r-- 1 root root  6648 Mar  4  1997 README
drwxr-xr-x 2 root root  4096 May  7 12:46 bin/
lrwxrwxrwx 1 root root    25 Jul 24 08:20 databases -> /usr/local/wormbase/blast/
drwxr-xr-x 2 root root  4096 Jan 27  2000 filter/
drwxr-xr-x 4 root root  4096 Oct  4  1998 matrix/
which can be set up in this manner (adapt to your system):
$ cd /usr/local/wublast
$ zcat /usr/local/TGZ/blast2.linux26-i686.tar.gz | tar xf -
$ chown -R root:root *
$ mkdir bin
$ mv *fasta tblast* blast* *db xd* memfile pam wu-blastall bin
$ ln -s /usr/local/wormbase/blast databases
The important thing to note is that the databases directory is a symbolic link to /usr/local/wormbase/blast. This is where the update_wormbase.pl script (described in the next section) dumps its BLAST databases.
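Since the link is the part most likely to be forgotten, a quick check can confirm it. This is a sketch; `link_points_to` is a hypothetical helper name, and it ignores a trailing slash on either path.

```shell
#!/bin/sh
# Sketch: confirm that a path is a symbolic link pointing at the expected target.
# link_points_to is a hypothetical helper name.
link_points_to() {
    [ -L "$1" ] || return 1
    target=$(readlink "$1")
    # Compare with any trailing slash removed from both sides.
    [ "${target%/}" = "${2%/}" ]
}

if link_points_to /usr/local/wublast/databases /usr/local/wormbase/blast; then
    echo "databases link is in place"
else
    echo "databases link is missing or points elsewhere"
fi
```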
The WormBase Software
Check out the WormBase software from CVS:
$ cd /usr/local/wormbase
$ cvs -d formaggio.cshl.org:/usr/local/cvs_repository co wormbase-website
$ mv wormbase-website website-classic
Configure localsdef.pm
- $HOST
This is the name of the host where the socket server runs. It is set to "localhost" by default.
- $PORT
This is the port on which the socket server runs, 2005 by default.
- $ACEPASS, $USERNAME, $PASSWORD
These three items define the acedb username and password.
- $MYSQL_HOST, $MYSQL_USER, $MYSQL_PASS
These three items define the mysql host, username, and password.
- $MASTER
This is used only for the WormBase master site. Should be set to 0.
- $MIRROR
Whether or not the site is a mirror. Should be set to the name of the mirror.
- $DEVELOPMENT
Whether or not the site is a development site. Internally, this controls the nature of caching on the site. Should be set to 0.
- $BLAST2WORMBASE, $WORMBASE2BLAST
These two options control where the blast script directs queries, and where those queries are returned. This is provided in the event that a second standalone blast server is provided. If not, these two options should point to:
$WORMBASE2BLAST=http://your.hostname.org/
Configuring Servers To Start Automatically
The final step is to arrange for Acedb to start automatically and for MySQL to restart if necessary.
Installing MySQL and BLAT monitoring scripts
Run:
$ cp -i /usr/local/wormbase/util/admin/blat_server.initd /etc/rc.d/init.d/blat_server
Then run:
$ crontab -u root -e
to add the following entries to root's crontab:
0 * * * * /usr/local/wormbase/util/admin/restart_mysqld.pl
0 * * * * /usr/local/wormbase/util/admin/restart_blat.pl
Acedb log rotation
Acedb generates massive log files. To keep these from growing too large, add the following entry to root's crontab (or that of another privileged user):
10 1 * * * /usr/local/wormbase/bin/rotatelogs.pl
Installing scripts to verify that the servers are running
Two scripts in the WormBase directory can be used to ensure that the mysql and blat servers are running. To install them:
% sudo cp /usr/local/wormbase/util/admin/blat_server.initd \
    /etc/rc.d/init.d/blat_server
Place the restart scripts under cron control of a privileged user. These commands will check every hour to see that the servers are running.
% sudo crontab -u root -e
0 * * * * /usr/local/wormbase/util/admin/restart_mysqld.pl
0 * * * * /usr/local/wormbase/util/admin/restart_blat.pl
At the same time, you might also wish to automate the rotation of logs to prevent them from growing to an unwieldy size. You'll find an appropriate log rotation configuration stanza in util/rotate_wormbase_logs and a log rotate script in /usr/local/wormbase/bin/rotatelogs.pl. You will need both.
# Rotate httpd logs
10 1 * * * /usr/local/wormbase/bin/rotatelogs.pl
# Rotate acedb logs
10 1 * * * logrotate /usr/local/wormbase/util/rotate_wormbase_logs
This stanza will check that the acedb server logs do not grow larger than 100 MB.
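To confirm that rotation is keeping up, you can list any files that have already passed the limit. This is a sketch; `oversized_logs` is a hypothetical helper, and the directory in the example call is the website-classic logs directory created earlier, used here only for illustration.

```shell
#!/bin/sh
# Sketch: list files under a directory that are larger than a limit given in MB.
# oversized_logs is a hypothetical helper; the default of 100 mirrors the
# 100 MB threshold mentioned above.
oversized_logs() {
    dir="$1"; limit_mb="${2:-100}"
    # -size +Nk matches files larger than N kibibytes.
    find "$dir" -type f -size +"$((limit_mb * 1024))"k 2>/dev/null
}

oversized_logs /usr/local/wormbase/website-classic/logs || echo "log directory not found"
```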
Testing The Site
At this point, all components of a WormBase installation have been installed. You can test your installation by restarting the various server components of WormBase.
Restarting AceDB
# Via xinetd:
$ /etc/init.d/xinetd reload (or restart)
# ...or using saceclient
% saceclient localhost -port 2005
acedb> password:
acedb> shutdown now
Restarting MySQL
# Via mysqladmin...
% mysqladmin -uroot -pPASSWORD shutdown
# or using init.d
$ /etc/init.d/mysql restart
Restarting Apache
When the configuration files have been checked and adjusted, restart Apache with the following command:
$ /etc/init.d/apache2 restart
Check /usr/local/wormbase/logs/classic-error_log for WormBase-specific errors and /var/log/apache2/error_log for general errors.
BLAT
% /usr/local/blat/bin/gfServer start localhost 2003 \
    /usr/local/wormbase/blat/*.nib > /dev/null 2>&1 &
Blocking robots
It can be useful to block search engines (such as Google) from crawling over one's mirror. To do this, go to /usr/local/wormbase/html, and make a file called "robots.txt" with the following contents:
User-agent: *
Disallow: /
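The file can also be written from the command line. This is a sketch; `write_robots` is a hypothetical helper, and the example writes into a scratch directory rather than the live HTML directory.

```shell
#!/bin/sh
# Sketch: write a robots.txt that asks all crawlers to stay away.
# write_robots is a hypothetical helper; pass the site's HTML directory.
write_robots() {
    cat > "$1/robots.txt" <<'EOF'
User-agent: *
Disallow: /
EOF
}

docroot=$(mktemp -d)    # scratch directory for a dry run
write_robots "$docroot"
cat "$docroot/robots.txt"
```

For the real site, call `write_robots /usr/local/wormbase/html` as a user with write access to that directory.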
AUTHOR
Todd Harris (toddwharris@gmail.com)