Former How To Build A WormBase Mirror

IMPORTANT NOTE

PLEASE NOTE THAT AS OF 2009.09, THIS DOCUMENTATION IS DEPRECATED.

Should you wish to install WormBase locally, please see:

http://www.wormbase.org/wiki/index.php/WormBase_Development_Environment

Todd Harris (todd@wormbase.org)


Hardware requirements

WormBase runs on a Unix or Linux system (with the caveat that Mac OS X is, essentially, a Unix system). A relatively fast system with generous memory is strongly recommended. The minimum suggested hardware is:

  • 900 MHz Pentium III or higher, 1.5 GHz G4 or higher
  • 2 gigabytes of RAM, 4 preferable
  • 4 gigabytes of swap space
  • 30 gigabytes free disk space

Each database occupies approximately 10 gigabytes of disk space, and you will need at least twice that in order to stage and unpack new versions of the database. In addition, count on having another gigabyte used by BLAST databases.
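The disk figures above can be turned into a quick pre-flight check. The following is only a sketch: it assumes a single database of about 10 gigabytes, and the function names and the default target filesystem are illustrative, not part of any WormBase tool.

```shell
#!/bin/sh
# Rough disk budget from the figures above (all values in gigabytes).
DB_GB=10          # approximate size of one database
STAGING_FACTOR=2  # twice the database size is needed to stage and unpack
BLAST_GB=1        # BLAST databases

# required_gb: print the estimated free space needed, in GB.
required_gb() {
    echo $(( DB_GB * STAGING_FACTOR + BLAST_GB ))
}

# free_gb DIR: print the free space on the filesystem holding DIR, in whole GB.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

target=${1:-/}    # assumed default; point this at the install filesystem
echo "need about $(required_gb) GB, have $(free_gb "$target") GB free on $target"
```

Under these assumptions the budget comes to 21 gigabytes, which fits within the 30 gigabytes of free disk space suggested above.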

WormBase partly runs on ACeDB (a somewhat object-oriented database) while also partly running on relational databases. For this reason, it requires both ACeDB and MySQL to be installed. The middleware layer for ACeDB is AcePerl, and the middleware layer for MySQL is the Bio::DB::GFF module, which comes with the BioPerl package.

Software requirements

You will need the following software packages:

  • e-PCR (modified version, required for e-PCR search page)

This is located in the directory /usr/local/wormbase/e-PCR, which will come into existence after the WormBase site update program wb_update_wormbase.pl has been successfully run (see below for details). Once the directory has been generated, run:

   $ cd /usr/local/wormbase/e-PCR
   # Edit 'makefile' to run install rather than ginstall, which doesn't exist on Fedora Linux
   $ make
   $ make install   # or just run 'install e-PCR /usr/local/bin'

The file /usr/local/wormbase/e-PCR/README-Wormbase describes the changes that were made to the original e-PCR distribution.

  • The WormBase Web software

Like e-PCR, this is obtained automatically by running wb_update_wormbase.pl.

Document conventions

The commands presented in this document are tailored for an out-of-the-box Red Hat-style Linux (Fedora Core 4 [FC4] or Fedora Core 5 [FC5]) installation. These commands may vary on your system.

The "$" prompt is used throughout this document to denote the root command-line prompt. Root-level commands should be issued as the root user, either by logging in as root (via "su") or by using "sudo" as a normal user. Normal user prompts (whether sudo-ized or not) are denoted as "%". Long lines that should be entered as a single line at the command line are split by '\' to increase legibility. You can safely enter these backslashes at the command line as well.

Detailed instructions

Upgrade Perl to 5.8.8

On Fedora Core 4, as of 6/19/2006, it was still necessary to upgrade to Perl 5.8.8 by source code (shown below). However, on Fedora Core 5, Perl 5.8.8 had become available as a packaged update available by running:

   $ yum update perl

as root. To check and see if such an update is available, run:

   $ yum info perl

If that fails or isn't available, get 5.8.8 from source code:

% sudo perl -MCPAN -e shell
cpan> install N/NW/NWCLARK/perl-5.8.8.tar.gz

Choose all of the defaults EXCEPT:

Choose to build a threaded Perl.

When asked:

What shall I put after the #! to start up perl ("none" to not use #!)?

Enter:

/usr/bin/perl

Quit cpan and restart it to reconfigure CPAN.pm against 5.8.8.

% sudo perl -MCPAN -e shell
cpan> install Bundle::CPAN

Installing Perl modules

Install current versions of the following Perl modules, all of which are available from CPAN:

 LWP
 Bundle::CPAN
 Cache::Cache
 Cache::FileCache
 CGI
 CGI::Session
 CGI::Cache
 Date::Calc
 Date::Manip
 DB_File
 DBI
 DBD::mysql
 Digest::MD5
 GD
 GD::SVG
 GD::Graph
 HTML::Parser
 HTML::TokeParser
 IO::Scalar
 IO::String
 Image::GD::Thumbnail
 MIME::Lite
 Net::FTP
 Proc::Simple
 readline
 Search::Indexer
 SOAP::Lite
 Statistics::OLS
 Storable
 SVG
 SVG::Graph
 Term::ReadLine
 Test::Pod
 Text::Shellwords
 Time::Format
 WeakRef
 XML::SAX  [optional]
 XML::Parser  [optional]
 XML::DOM  [optional]
 XML::Writer  [optional]
 XML::Twig  [optional]
 XML::Simple  [optional]
 

If the XML modules are present, the Genome Browser will be able to dump GAME and BSML versions of the sequence annotations. If the optional GD::SVG and SVG modules are present, the Genome Browser will be able to produce output in the Scalable Vector Graphics format. The GTop modules add monitoring capabilities for Apache.

One easy and standard way to install the required Perl modules is with the CPAN module:

 % sudo perl -MCPAN -e shell

The first time you run this, it will go through some configuration steps. After this you will be presented with the "cpan>" prompt. Type "?" to get a list of options. To install modules, type "install <module_name>". For example, here's how to install the "Bundle::CPAN" module, which bundles together a number of modules recommended by CPAN:

 cpan> install Bundle::CPAN

In Perl 5.10, there will be an alternative method using CPANPLUS and "cpanp i"-based shell scripts. However, this is not yet straightforward to set up on Fedora Core 4.

Preparing directories and users

WormBase uses several user accounts for directory and server permissions. You will need to create these users and several preliminary directories. Creating a new user and group varies among Unix flavors. On most Linux systems, the following commands will create the new groups. You should have sudo privilege to execute these commands.

User and group accounts

These users should not have a login password. They are to establish privileges only.

  • acedb group

This is the group that will have write privileges to the acedb directory tree. Acedb administrators should be added to this group.

$ /usr/sbin/groupadd acedb

  • acedb user

This is the user that the acedb server will run as. It should be a member of the acedb group.

$ /usr/sbin/useradd -g acedb -d /usr/local/acedb acedb

This useradd command also adds the new acedb user to the acedb group. Note that the acedb user's home directory was set to /usr/local/acedb, a directory which will be created in the next step.

  • wormbase group

This is a group that will have write privileges to the wormbase directory tree. WormBase administrators and authors should be added to this group.

$ /usr/sbin/groupadd wormbase

This would be a good time to add yourself to the acedb and wormbase groups.

$ /usr/sbin/usermod -a -G acedb,wormbase [your_login_name]

[The '-a' argument keeps this command from deleting other, preexisting group memberships.]

You may need to re-login for these changes to take effect. Use the groups command to check which groups you are a member of:

% groups
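The same check can be scripted. This sketch assumes the group names created above and uses "id -nG", which prints the current user's groups as a space-separated list; the helper name is made up for illustration.

```shell
#!/bin/sh
# has_group "GROUP LIST" NAME: succeed if NAME appears in the
# space-separated GROUP LIST (the format printed by `id -nG`).
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

mine=$(id -nG 2>/dev/null) || mine=""
for g in acedb wormbase; do
    if has_group "$mine" "$g"; then
        echo "member of $g"
    else
        echo "NOT a member of $g (log in again after usermod?)"
    fi
done
```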

Directories

Create the following directories:

  • /usr/local/acedb, owner=acedb group=acedb mode=drwxrwsr-x
$ mkdir /usr/local/acedb
$ chown acedb:acedb /usr/local/acedb
$ chmod 2775 /usr/local/acedb
  • /usr/local/wormbase, owner=root group=wormbase mode=drwxrwsr-x
$ mkdir /usr/local/wormbase
$ chgrp wormbase /usr/local/wormbase
$ chmod 2775 /usr/local/wormbase
  • /usr/local/wormbase/logs, owner=root group=wormbase mode=drwxrwsr-x
$ mkdir /usr/local/wormbase/logs
$ chgrp wormbase /usr/local/wormbase/logs
$ chmod 2775 /usr/local/wormbase/logs
  • /usr/local/wormbase/cache, owner=nobody group=nobody mode=drwxrwsr-x
$ mkdir /usr/local/wormbase/cache
$ chown nobody:nobody /usr/local/wormbase/cache
$ chmod 2775 /usr/local/wormbase/cache
  • ~ftp/pub/wormbase, owner=root group=wormbase mode=drwxrwsr-x
$ mkdir -p ~ftp/pub/wormbase
$ chgrp wormbase ~ftp/pub/wormbase
$ chmod 2775 ~ftp/pub/wormbase

You may ignore this step if you do not plan to mirror the WormBase FTP site. In this example, the -p option is used to create the intermediate parent directories if they don't already exist. If your mkdir doesn't support this option, you will need to create the intermediate directories manually.

  • /usr/local/wublast, owner=root group=wormbase mode=drwxrwsr-x
$ mkdir /usr/local/wublast
$ chgrp wormbase /usr/local/wublast
$ chmod 2775 /usr/local/wublast

You may safely ignore this step if you do not plan to support the Blast/Blat search page.

The "s" bit in the group permissions for these directories ensures that new directories and files created within them will be owned by the same group as the directory. This allows groups of administrators to have read/write access to project files. For this to work, however, these individuals' default umask must be set to 002 when they log in.
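The effect of mode 2775 and a 002 umask can be tried out safely in a scratch directory before touching /usr/local. This sketch assumes GNU stat (the "stat -c" form found on Linux); the scratch directory is temporary and is removed when the script exits.

```shell
#!/bin/sh
# Demonstrate the 2775 mode and a 002 umask in a scratch directory.
scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT   # clean up when the script ends

chmod 2775 "$scratch"           # rwxrwsr-x: the leading 2 is the setgid bit

umask 002                       # new files: group-writable, not world-writable
touch "$scratch/newfile"
mkdir "$scratch/newdir"

# Print the octal mode of each item.
stat -c '%a %n' "$scratch" "$scratch/newfile" "$scratch/newdir"
```

With a umask of 022 instead, the new file would not be group-writable, which is why the administrators' default umask matters.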

Installing AceDB

You must have a working Acedb socket server to support WormBase. For best results the server should be running on the same machine as the WormBase web site. This process is explained in detail because it is the trickiest part of installing a WormBase site. You may install Acedb from source or binary packages.

Installing Acedb from a binary package

The following commands will fetch the latest version of Acedb (here 4.9.30) into a temporary build directory.

% cd ~/build
% mkdir acedb ; cd acedb
% curl -O ftp://ftp.sanger.ac.uk/pub/acedb/SUPPORTED/ACEDB-STATIC_serverLINUX.4.9.30.tar.gz
% curl -O ftp://ftp.sanger.ac.uk/pub/acedb/SUPPORTED/ACEDB-STATIC_binaryLINUX.4.9.30.tar.gz
% gunzip -c ACEDB-* | tar xvf -
     If this fails, try unpacking each archive independently
     (adjust the path to wherever the archives were saved):
       % zcat /usr/local/TGZ/ACEDB-STATIC_binaryLINUX.4.9.30.tar.gz | tar xf -
       % zcat /usr/local/TGZ/ACEDB-STATIC_serverLINUX.4.9.30.tar.gz | tar xf -
% mv ACEDB-* ~/mirror/src/.   # Stow the downloaded archives if you'd prefer

Now copy the unpacked binaries to the acedb bin directory, altering the permissions as appropriate.

$ mkdir ~acedb/bin
$ mv * ~acedb/bin/.
$ chown root:root ~acedb/bin/*       # This changes both owner and group to root.

Installing Acedb from source

Download the most recent release of Acedb (which, in March 2006, was ACEDB-source.4.9.30.tar.gz) from http://www.acedb.org. Do not unpack the tar.gz source file; the INSTALL script uncompresses and untars it automatically (see below for details). Also download the following files:

 NOTES
 README (brief list/description of files)
 INSTALL (a ksh script, first time installation script)

Place all the .tar.gz files you need, plus INSTALL, in the directory ~acedb/. The INSTALL script installs packages into your current working directory.

 % chmod u+x INSTALL
 $ ./INSTALL
 Note: make the .tar.gz files readable by acedb group

In the terminal, you will see:

 ...
 directory permissions OK...
 Files will be owned by: "tharris"
 Files will be installed in: /usr/local/acedb

The script will install the package as a subdirectory of the current directory. If such a directory exists you will be given a chance to install that package or abort the installation altogether.

To build Acedb with system-wide GNU software libraries, you need to first:

 # cd source

Then you need to tell the makefile which of many definitions for the environment variable $ACEDB_MACHINE you want. The list available in acedb 4.x.30 is long (to see the full list, type "make"), but the ones most likely to be relevant are LINUX_4, LINUX_4_RH9, LINUX_GTK2_4, LINUX_MAC_4, and MACOSX_4. (Note that for some arcane reason, these variables are listed by "make" with a suffix "_DEF" that should not actually be used in $ACEDB_MACHINE.)

For instance:

 # export ACEDB_MACHINE=LINUX_4     # this is the bash-shell version of 'setenv'
 # make all

If the source code builds properly, you can then find all binaries in the bin.LINUX_4 directory. (However, as of June 2006, it was not obvious whether any of these variables worked with Fedora Core 4! If this proves intractable, fall back on using precompiled binaries...)

Create the directory /usr/local/acedb/bin (referred to as ~acedb/bin later, provided that you created an acedb user). Copy the following files to this directory:

  saceserver
  saceclient
  sgifaceserver
  tace
  giface
  xace
  makeUserPasswd

Make sure that these files are executable and owned by root:

  $ chown root:root ~acedb/bin/*      # This sets both owner and group to 'root'.
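Whichever installation route you took, a check like the following can confirm that every expected binary is in place and executable. It is a sketch; the function name is illustrative, and the directory and file names are the ones listed above.

```shell
#!/bin/sh
# missing_binaries DIR NAME...: print each NAME that is not an
# executable regular file in DIR.
missing_binaries() {
    dir=$1
    shift
    for name in "$@"; do
        [ -f "$dir/$name" ] && [ -x "$dir/$name" ] || echo "$name"
    done
}

missing_binaries /usr/local/acedb/bin \
    saceserver saceclient sgifaceserver tace giface xace makeUserPasswd
```

No output means all seven files were found.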

Installing Ace.pm

Ace.pm provides programmatic access to Acedb. You can install it via CPAN:

 cpan> install Ace

The Ace module is the most recent version of AcePerl. When you install it, you will be asked whether to build the pure Perl version, an optimized version for sockets only, or an optimized version that works with sockets and the older RPC-based server. Choose either option (2) or (3).

When you install Ace for the first time it will also ask you whether you want to install AceBrowser. Answer "yes" the very first time you install it (you do not need to answer in the affirmative for subsequent updates). It will then ask you to choose paths for its configuration files. For WormBase installs, these are the right values to choose:

  Site-specific configuration files:  /usr/local/wormbase/conf
                           CGI path:  /usr/local/wormbase/cgi-bin
                          HTML path:  /usr/local/wormbase/html

Note: Perl scripts that use Ace.pm need to know where the ACeDB binaries (e.g., giface) are on the system -- they don't just automagically find them. So it is a good idea to add "/usr/local/acedb/bin" to the systemwide PATH in /etc/profile before trying to run Ace, even in one-off Perl scripts that do simple data mining.
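One way to make that PATH change without adding the directory twice is sketched below. The function name is hypothetical; the snippet could be placed in /etc/profile or in a personal shell startup file.

```shell
#!/bin/sh
# path_append DIR: add DIR to the end of PATH unless it is already there.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already present, do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /usr/local/acedb/bin
export PATH
```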

Installing BioPerl

You may install BioPerl either using anonymous CVS or by downloading and installing the most recent stable core.

To install BioPerl from the current stable release (which was once recommended):

% wget http://bioperl.org/DIST/current_core_stable.tar.gz
% gunzip -c cur* | tar xvf -
% cd bioperl-1.4
% perl Build.PL
% ./Build test
% sudo ./Build install

However, in March 2006 this stable release did not work with GBrowse-1.64. So the default choice is currently to install BioPerl from CVS. Installing from CVS will give you the latest version of BioPerl, but may also include unresolved bugs and experimental code.

% cvs -d :pserver:cvs@cvs.open-bio.org:/home/repository/bioperl login
   when prompted for the password, type 'cvs'
% cvs -d:pserver:cvs:cvs@cvs.open-bio.org:/home/repository/bioperl co bioperl-live
% cd bioperl-live
% perl Build.PL
% ./Build test
% sudo ./Build install

(Note that BioPerl once used the old system of "perl Makefile.PL; make; make test; sudo make install", but that this has been superseded by the brave new world of Module::Build and Build.PL.)

The CVS checkout creates a directory named bioperl-live. In the future, when you wish to update to the most recent version, simply type "cvs update" in the bioperl-live directory.

If your copy of BioPerl has no Build.PL, install it the older way, by running "perl Makefile.PL", "make", "make test" and "make install".

Finally, go into CPAN again and run:

cpan> install Bio::Das

Installing and configuring MySQL

Installing MySQL

MySQL is extremely well documented and included in many package repositories. A general reference is Managing & Using MySQL by Reese, Yarger & King with Williams (O'Reilly); many specific tasks are easily deciphered with on-line documentation or by Googling queries.

Just follow the installation instructions and set it up to start automatically when the server is booted. For Linux Fedora Core 4+, the simplest way to do this is through the Service Configuration GUI:

If administering remotely:

  $ ssh -Y root@server.machine                         # from your machine
  $ /usr/bin/python /usr/sbin/system-config-services   # on server.machine

or

  $ ssh -Y user@server.machine                         # from your machine
  $ su                                                 # on server.machine
  $ /usr/bin/python /usr/sbin/system-config-services

If administering locally:

  $ su                                                 # on server.machine
  $ /usr/bin/python /usr/sbin/system-config-services   # on server.machine

This will provide a "Service Configuration" GUI. Pick run level 3 from the Edit Runlevel menu; then, in the Service Configuration window, check mysql and click on the Start toolbar button, then the Save button. Then pick run level 5 and make sure mysql is checked off (since it should only need to start once during run level 3 to also be available for run level 5).

If working on a computer without pre-existing MySQL in its distribution, one can install from a binary distribution (without using a package manager). The commands are as follows (assuming an unpacked mysql distribution!):

  % groupadd mysql
  % useradd -g mysql mysql
  % cd /usr/local
  % gunzip < /PATH/TO/MYSQL-VERSION-OS.tar.gz | tar xvf -
  % ln -s FULL-PATH-TO-MYSQL-VERSION-OS mysql
  % cd mysql
  % scripts/mysql_install_db --user=mysql
  % chown -R root  .
  % chown -R mysql data
  % chgrp -R mysql .
  % bin/mysqld_safe --user=mysql &
  # So the server starts up automatically, do:
  % cp support-files/mysql.server /etc/rc.d/init.d/mysql
  # Start the server by:
  % sudo /etc/rc.d/init.d/mysql start

Having gotten MySQL running either from pre-existing software (e.g., on FC4) or from a binary distribution, one must then configure it:

  # Set up a root password
  % mysqladmin -uroot password PASSWORD

Creating MySQL databases

WormBase requires a number of MySQL databases. These include

  • elegans - genome annotations for C. elegans
  • elegans_pmap - physical map coordinates for C. elegans
  • elegans_gmap - genetic map coordinates for C. elegans
  • remanei - genome annotations for C. remanei
  • briggsae - genome annotations for C. briggsae

These databases will be mirrored automatically. In advance you should grant suitable privileges so the web server can access these databases.

In the following walkthrough, the mysql administrator's name is "root" and the password is "PASSWORD". The current user's login name is assumed to be "ME". Substitute for these values as appropriate (presumably not using names in BIFF-like ALL-CAPS).

Create the elegans database, then give yourself write permissions for the database:

# mysql -u root -pPASSWORD
 mysql> create database elegans;
 mysql> grant all privileges on elegans.* to ME@localhost;
 mysql> grant file on *.* to ME@localhost;
 mysql> grant select on elegans.* to nobody@localhost;
 

[Note: this could be done with "mysqladmin -uroot -pPASSWORD create elegans", but that's superfluous given commands below.]

The final command gives the "nobody" user (typically httpd) read access to the database. Note that this is set up so that you do not have to type a password to upload information into these databases provided that you are logged into the database machine.

Repeat the steps to establish MySQL databases for the remaining databases listed above. Then:

mysql> quit;

If it turns out you've made a mistake, you can abolish privileges:

mysql> revoke all privileges on *.* from ME@localhost;
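Repeating the create and grant statements for each remaining database is mechanical, so they can be generated rather than typed. The sketch below only prints SQL; the function name is made up, and "ME" is the placeholder login name from the walkthrough.

```shell
#!/bin/sh
# grant_sql DB USER: print the statements used above for one database.
grant_sql() {
    db=$1
    user=$2
    cat <<EOF
create database $db;
grant all privileges on $db.* to $user@localhost;
grant file on *.* to $user@localhost;
grant select on $db.* to nobody@localhost;
EOF
}

# The databases from the list above, other than elegans.
for db in elegans_pmap elegans_gmap remanei briggsae; do
    grant_sql "$db" ME
done
```

Review the output first, then feed it to the mysql client, for example by piping it into "mysql -u root -p".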

Installing Apache and mod_perl

To a large extent, installing Apache/mod_perl is exactly as described in the documentation that accompanies these packages. However, you must be careful to use mod_perl's Makefile.PL to configure and build Apache, as it deactivates the built-in expat library that comes with Apache. Otherwise, you will be unable to run the WormBase pages that rely on XML parsing.

NOTE: on distributions like Red Hat's Fedora Core 4, the simplest way of installing the operating system on a new computer is to select the "everything" option for what packages are to be installed. While this is convenient, it complicates later installation of Apache and mod_perl for WormBase. The WormBase site software uses somewhat archaic versions of Apache/mod_perl (1.3). Meanwhile, FC4 preinstalls binaries for Apache/mod_perl based on a totally different, newer version of Apache (2.0+). These pre-existing versions must be uninstalled before building Apache/mod_perl (1.3) for WormBase from source code. Moreover, figuring out just how many packages need to be uninstalled along with these is somewhat empirical, since current distributions like FC4 tend to have many interdependent extra packages that go along with these.

The simplest way to do this is probably through the yum package management system. Try:

   $ yum remove httpd mod_perl

With luck this will cleanly remove httpd and mod_perl RPMs ... along with many, many other RPMs that are dependent on them. There should be a dialogue where the program lists these packages and asks you if you really want them removed. But a single "y" answer should at least give you a full uninstall of this hairball.

If for some reason yum is not an option, you'll probably have to use the much more brittle "rpm -evv" method. One way to get going on FC4 is to run:

   $ rpm -qa | grep httpd      (to confirm that httpd is a pre-installed RPM)
   $ rpm -qa | grep mod_perl   (to confirm that mod_perl is a pre-installed RPM)
   $ rpm -evv httpd mod_perl

FC4 will typically loudly refuse to carry out a simple "rpm -evv" order for just httpd and mod_perl alone, spitting out a long list of packages that will be broken if these two packages are removed. The simplest way to deal with this is to then order FC4 to delete them all (cutting-and-pasting their names from the long error report, and rerunning "rpm -evv" with all these packages as an argument). Generally this will lead to yet another error message (though, one hopes, a shorter one) bewailing yet more endangered dependencies. Add these packages to the list of deletia, and re-rerun "rpm -evv". Iteratively, one will eventually converge on a package list that FC4 will in fact permit to be uninstalled, and that will include httpd and mod_perl.

Unpack Apache and mod_perl into two side-by-side directories:

drwxr-xr-x    8 tharris       4096 Jul 16 15:42 apache_1.3.17/
drwxr-xr-x   24 tharris       4096 Jul 16 14:45 mod_perl-1.25/

Enter the mod_perl directory, and run this command:

   $ perl Makefile.PL DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
       APACHE_PREFIX=/usr/local/apache \
       APACI_ARGS='--enable-shared=info --enable-shared=status'

The primary installation site of WormBase also needs the proxy module and uses this configuration command.

   $ perl Makefile.PL DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
       APACHE_PREFIX=/usr/local/apache \
       APACI_ARGS='--enable-shared=info --enable-shared=status \
                   --enable-shared=proxy  --enable-module=rewrite'

In the first command, the APACI_ARGS option turns on two Apache modules (info and status) that give status information. They are not strictly necessary for WormBase.

After Makefile.PL prints a bunch of diagnostic information, run the following two commands:

   $ make
   $ make test

The LWP library must be installed before you "make test". If the tests are successful, run:

   $ make install

This will install Apache into the directory /usr/local/apache.

Installing the Generic Genome Browser

This is a CGI script and some Perl modules that use Bio::DB::GFF and Bio::Graphics to create the main WormBase genome display. It lives at www.gmod.org. Like BioPerl, GBrowse can be installed via anonymous CVS or from the current stable release. As of June 2006, the current stable release (1.64) is the only one that actually works.

Via the latest stable release:
   $ wget http://easynews.dl.sourceforge.net/sourceforge/gmod/Generic-Genome-Browser-1.64.tar.gz
   $ gunzip -c Gene* | tar xvf -

CVS doesn't work yet (as of June 2006), but:

 Via CVS:
   $ cvs -d:pserver:anonymous@gmod.cvs.sourceforge.net:/cvsroot/gmod login
     (When prompted for a password for anonymous, simply press the Enter key.)
   $ cvs -d:pserver:anonymous@gmod.cvs.sourceforge.net:/cvsroot/gmod co Generic-Genome-Browser

Note that: (1) version 1.64 is the latest stable release as of June 2006, but later versions will probably exist; (2) the clumsy "wget-longURL" command is a way of getting around Sourceforge's lack of easy, reliable URLs available to users for source-code packages.

On a Linux distribution using SELinux security (such as Fedora Core 4), compilation will fail until SELinux is in some way circumvented. One way to do this is:

   $ setsebool -P httpd_disable_trans 1
   $ /usr/local/apache/bin/apachectl restart

Then enter the unpacked directory or that fetched by CVS and run the following incantation to install it in the proper place for WormBase:

   $ perl Makefile.PL HTDOCS=/usr/local/wormbase/html \
                  CGIBIN=/usr/local/wormbase/cgi-perl/seq \
                  CONF=/usr/local/wormbase/conf \
                  --SELINUX=1                            # disables SELinux
   $ make
   $ make install

The long Makefile.PL incantation can be cut-and-pasted from this document into the command line. The last argument of Makefile.PL is only needed if SELinux exists on the system.

Installing BLAT

Jim Kent's BLAT (blast-like alignment tool) is a fast nucleotide aligner used by the blast search page. If you do not plan to support blast searches, you may safely skip this step.

$ mkdir -p /usr/local/blat/bin ; cd /usr/local/blat/bin
% wget http://www.soe.ucsc.edu/~kent/exe/linux/blatSuite.33.zip    # for Intel Linux
% unzip blatSuite.33.zip
% rm blatSuite.33.zip version.doc 11.ooc

Note that this choice gives precompiled binaries for an Intel-based Linux distribution as of March 2006. It would probably be worth checking http://www.soe.ucsc.edu/~kent/exe/linux to see if there is a more up-to-date version than 33. Also, other operating systems will need other binaries. E.g., for Mac OS X, instead run:

 % wget http://www.soe.ucsc.edu/~kent/exe/osX/blatSuite.33.zip

For other types of operating systems (e.g., Linux on Opteron-based machines), see http://www.soe.ucsc.edu/~kent/exe/ for the available choices.

The blat server will be started automatically by the update script. For reference, the blat server is launched using the following command.

% /usr/local/blat/bin/gfServer start localhost 2003 \
     /usr/local/wormbase/blat/*.nib > /dev/null 2>&1 &

Installing BLAST

The Blast page requires WU-BLAST. This is a closed-source derivative of NCBI's BLAST. However, WU-BLAST is free to academic users (with licensing) and is thought to have performance advantages over NCBI-BLAST; it can be downloaded from http://blast.wustl.edu/. A typical choice of WU-BLAST for Linux is blast2.linux26-i686.tar.gz.

Conversely, the Blast page can be deactivated if you don't want to provide BLAST searches at your site.

By default, WormBase expects WU-BLAST to be installed in /usr/local/wublast. This is the directory structure used by WormBase:

% ls -l /usr/local/wublast
total 72
lrwxrwxrwx  1 root  root     18 May  7 12:26  BLOSUM62 -> matrix/aa/BLOSUM62
-rw-r--r--  1 root  root  46789 Feb  5  1998  HISTORY
-rw-r--r--  1 root  root   6648 Mar  4  1997  README
drwxr-xr-x  2 root  root   4096 May  7  12:46 bin/
lrwxrwxrwx  1 root  root     25 Jul 24  08:20 databases -> /usr/local/wormbase/blast/
drwxr-xr-x  2 root  root   4096 Jan 27  2000  filter/
drwxr-xr-x  4 root  root   4096 Oct  4  1998  matrix/

which can be set up in this manner (adapt to your system):

$ cd /usr/local/wublast
$ zcat /usr/local/TGZ/blast2.linux26-i686.tar.gz | tar xf -
$ chown -R root:root *
$ mkdir bin
$ mv *fasta tblast* blast* *db xd* memfile pam wu-blastall bin
$ ln -s /usr/local/wormbase/blast databases

The important thing to note is that the databases directory is a symbolic link to /usr/local/wormbase/blast. This is where the wb_update_wormbase.pl script (described in the next section) dumps its BLAST databases.
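Since everything hinges on that symbolic link, it is worth verifying. The following sketch uses readlink, which is available on Linux and Mac OS X; the function name is illustrative.

```shell
#!/bin/sh
# link_points_to LINK TARGET: succeed if LINK is a symbolic link whose
# stored target is TARGET (a single trailing slash is ignored on both).
link_points_to() {
    [ -L "$1" ] || return 1
    actual=$(readlink "$1")
    [ "${actual%/}" = "${2%/}" ]
}

if link_points_to /usr/local/wublast/databases /usr/local/wormbase/blast; then
    echo "databases link OK"
else
    echo "databases link missing or pointing elsewhere"
fi
```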

Installing the WormBase files

Most of the necessary files for a WormBase installation will be installed by the wb_update_wormbase.pl script. This script is installed as a part of the Bio::WormBase suite of Perl modules.

Installing Bio::WormBase.pm

Bio::WormBase.pm is maintained on Sourceforge. You should install it from CVS.

Anonymous access works with:

cvs -d:pserver:anonymous@bio-wormbase.cvs.sourceforge.net:/cvsroot/bio-wormbase login
cvs -z3 -d:pserver:anonymous@bio-wormbase.cvs.sourceforge.net:/cvsroot/bio-wormbase co -P Bio-WormBase.pm

As before, just use "[enter]" (i.e., nothing) as the password.

Developer access works, but requires a developer account and password:

export CVS_RSH=ssh
cvs -z3 -d:ext:DEVELOPERNAME@bio-wormbase.cvs.sourceforge.net:/cvsroot/bio-wormbase co -P Bio-WormBase.pm

Both these access methods worked as of 6/1/2006.

To build Bio::WormBase.pm, enter the directory and do:

 # perl Makefile.PL
 # make
 # sudo make install        # or just 'make install' if root

You can configure the installation by passing the following variables to the "perl Makefile.PL" command:

  MYSQLDATADIR= the full path to your mysql data dir
  WBADMIN= the email address of the WormBase administrator
  TMPDIR= the full path to a temporary directory for holding updates

For more arguments, run:

  # perl Makefile.PL -h

On Fedora Core 4+, the three variables above are:

   /var/lib/mysql
   [your.working@email.address]
   [the default, '.', should be OK]

So for Fedora Core 4+, these commands work:

  # perl Makefile.PL MYSQLDATADIR=/var/lib/mysql WBADMIN="your.working@email.address"
  # make
  # sudo make install

During the build process, Bio::WormBase.pm creates a configuration file specific for your machine. This configuration file is located at

 /usr/local/wormbase/update_scripts/conf/[hostname].cfg

You shouldn't need to edit any of the values in this file, but take a look at the options to make sure they are appropriate for your installation. The scripts supplied with Bio::WormBase.pm will automatically select this configuration file. If you make modifications to this file, you should back it up to prevent it from being overwritten by subsequent updates to Bio::WormBase.pm. You can override the default configuration file used by any script by passing the '--config' option along with the path to the configuration file.
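To locate the file for the current machine and see what a backup would look like, something like the following may help. It is a sketch under one notable assumption: that the [hostname] part of the file name matches the output of "uname -n" (it may instead be a short or fully qualified name on your system). The function name is made up.

```shell
#!/bin/sh
CONF_DIR=/usr/local/wormbase/update_scripts/conf

# config_path HOST: print the per-host configuration file path.
config_path() {
    echo "$CONF_DIR/$1.cfg"
}

cfg=$(config_path "$(uname -n)")   # assumption: file is named after `uname -n`
if [ -f "$cfg" ]; then
    # Suggest a dated copy so local edits survive a later reinstall.
    echo "to back up: cp -p $cfg $cfg.$(date +%Y%m%d)"
else
    echo "no configuration file at $cfg"
fi
```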

Using wb_update_wormbase.pl to update the installation

To update the installation:

$ wb_update_wormbase.pl > /usr/local/wormbase/logs/update.log

or (if you want to walk away from it):

$ nohup wb_update_wormbase.pl > /usr/local/wormbase/logs/update.log &

Note that '>' will blow away any preexisting version of update.log. If this is a problem, use '>>'.
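If you prefer to keep a running history, a small wrapper can append each run to the log under a timestamped header. This is a sketch; run_logged is a hypothetical helper, demonstrated here with a harmless command and a scratch file.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND...: append a timestamped header and the
# command's output (stdout and stderr) to LOGFILE.
run_logged() {
    log=$1
    shift
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') $*"
        "$@"
    } >> "$log" 2>&1
}

# Demonstration with a harmless command and a scratch log file.
demo_log=$(mktemp)
run_logged "$demo_log" echo "first run"
run_logged "$demo_log" echo "second run"
cat "$demo_log"
rm -f "$demo_log"
```

For the real update, the call would look like "run_logged /usr/local/wormbase/logs/update.log wb_update_wormbase.pl".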

This script will check for a newer release of the database. If present, it will

  • download and install a new version of the acedb database
  • download and install elegans GFF databases
  • download and install briggsae GFF databases
  • download and install remanei GFF databases
  • update the software via rsync
  • clear the on disk cache
  • purge the temporary directory
  • purge old releases of the database to save disk space

Options:
If you aren't hosting blast, blat, or e-PCR on your site, you might want to omit these from the update process. You can do this by:

$ wb_update_wormbase.pl --excludes databases

To force installation to a particular release:

$ wb_update_wormbase.pl --version [VERSION] --force

Note:
You do not need to be root to execute this command IF:

  1. You are a member of the mysql group AND the mysqldatadir is group writable
  2. You are a member of the acedb group AND the acedb directory is group writable

Either way, you will still need to restart sgifaceserver, mysqld, and httpd following an update to a new database!
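Whether the current user can run the update without root can be checked directly by testing write access to the directories involved. This is a sketch; the function name is illustrative, /var/lib/mysql is the Fedora data directory mentioned earlier, and both paths should be adjusted to your installation.

```shell
#!/bin/sh
# unwritable_dirs DIR...: print each DIR that the current user cannot
# write to (or that does not exist).
unwritable_dirs() {
    for dir in "$@"; do
        [ -d "$dir" ] && [ -w "$dir" ] || echo "$dir"
    done
}

blocked=$(unwritable_dirs /var/lib/mysql /usr/local/acedb)
if [ -z "$blocked" ]; then
    echo "update can run without root"
else
    echo "root (or group write access) needed for:"
    echo "$blocked"
fi
```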

Running under cron
I suggest running this script under cron:

1 1 * * * /usr/local/bin/wb_update_wormbase.pl > /usr/local/wormbase/logs/update.log

Auxiliary scripts

Bio::WormBase.pm includes a number of useful auxiliary scripts. In particular, you can easily execute any single step of the update process using these scripts.

  • wb_check_versions.pl

Determine current live, development, and local versions.

  • wb_clear_cache.pl

Clear the WormBase on disk cache

  • wb_get_development_version.pl

Get the current development version

  • wb_get_live_version.pl

Get the current live version

  • wb_get_local_version.pl

Get the current local version

  • wb_install_component.pl

Install a specific component of a WormBase site, e.g.: wb_install_component.pl --component acedb --version WS155

  • wb_purge_old_releases.pl

Purge old releases from your machine, e.g., wb_purge_old_releases.pl --version WS155 will purge old releases up to (but not including) WS155.

  • wb_purge_tmp_dir.pl

Purge the temporary staging directory used during updates.

  • wb_restart_servers.pl

UNDER DEVELOPMENT

  • wb_rsync_software.pl

Update the WormBase software

  • wb_update_wormbase.pl

The main update script described above
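To make the "up to (but not including)" rule of wb_purge_old_releases.pl concrete, here is a small sketch that selects the releases such a purge would cover. It is not the script's actual implementation; the function name is made up, and release names are assumed to have the form WS followed by digits.

```shell
#!/bin/sh
# Sketch: list the releases strictly older than a given one.

# releases_older_than WSnnn: read release names (one per line) on stdin
# and print those whose number is lower than the given release.
releases_older_than() {
    limit=${1#WS}
    while IFS= read -r rel; do
        case $rel in
            WS[0-9]*) ;;          # looks like a release name
            *) continue ;;        # ignore anything else
        esac
        num=${rel#WS}
        case $num in
            *[!0-9]*) continue ;; # ignore names with trailing non-digits
        esac
        if [ "$num" -lt "$limit" ]; then
            echo "$rel"
        fi
    done
}

# Example:
#   printf 'WS150\nWS154\nWS155\nWS160\n' | releases_older_than WS155
# prints WS150 and WS154; WS155 itself and anything newer are kept.
```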

Installing Analog and Report Magic (optional)

If you are running an official mirror site or would like to analyze accesses to your server, you should also install Analog and ReportMagic. These software packages will be used to automatically analyze the access logs on a running basis.

Fetch analog:

% wget http://www.analog.cx/analog-6.0.tar.gz
% tar xzf analog-6.0.tar.gz
% cd analog-6.0
% make
% cd ../

(Note: sometimes wget of analog-6.0.tar.gz will hang. In that case, just use your Web browser to get the package onto your desktop computer, then scp it to your WormBase mirror computer.)

Copy analog to the primary WormBase root:

% cp -r analog-6.0 /usr/local/wormbase/util/log_analysis/.

Install Report Magic

% wget http://www.reportmagic.org/rmagic-2.21.tar.gz
% tar xzf rmagic-2.21.tar.gz

Edit Install.PL to place the report magic files in /usr/local/wormbase. Replace:

$DEST = '/usr/local/bin/rmagic-${VERSION}/';
with
$DEST = '/usr/local/wormbase/util/log_analysis/rmagic/';
# And install...
 % sudo perl Install.PL
 

Configuring the WormBase installation

Although most files are delivered through the update mechanism, you will need to customize several files. Templates for each file are provided.

Acedb configuration files - /usr/local/acedb/elegans/wspec/* (and /usr/local/wormbase/wspec/*)

The template files found in wspec store Acedb password and configuration options. After running the update script for the first time, you will need to edit these templates in order to get Acedb up and running. Once the edited files have been copied to a secure location at /usr/local/wormbase/wspec, subsequent invocations of the update script will automatically re-copy them into the new acedb database at /usr/local/acedb/elegans/wspec/.

  • passwd.wrm - controls local access to the acedb data files

This file contains the account names of local users who have write access to the database. If the site software has been installed properly, a fresh copy of this file should be available in the newly installed /usr/local/acedb/elegans database:

   $ ls -lt /usr/local/acedb/elegans/wspec/passwd.wrm

First, set up a /usr/local/wormbase/wspec directory (which will not have been created by the site software), and back the original version up:

   $ mkdir /usr/local/wormbase/wspec
   $ chown root:acedb /usr/local/wormbase/wspec
   $ chmod 2775 /usr/local/wormbase/wspec
   $ cp -p /usr/local/acedb/elegans/wspec/passwd.wrm /usr/local/wormbase/wspec/orig_passwd.wrm_file

Then edit /usr/local/acedb/elegans/wspec/passwd.wrm with your favorite text editor, e.g.:

   $ pico /usr/local/acedb/elegans/wspec/passwd.wrm

Erase the administrator names that are already there (e.g., users "lstein", "todd", and "nchen") and replace them with the administrator's (your!) login name(s). If you want to be able to update the database via the socket server, add user "acedb" to the list:

  // passwd.wrm
  your_name
  acedb

Finally, back the revised version up:

   $ cp -p /usr/local/acedb/elegans/wspec/passwd.wrm /usr/local/wormbase/wspec/passwd.wrm

  • serverpasswd.wrm - controls remote access to the acedb data files

This file contains usernames and passwords for those who have write access to the database. Again:

   $ cp -p /usr/local/acedb/elegans/wspec/serverpasswd.wrm /usr/local/wormbase/wspec/orig_serverpasswd.wrm_file
   $ pico /usr/local/acedb/elegans/wspec/serverpasswd.wrm   # or 'vi' or 'emacs'

Erase these lines:

 admin: admin lstein
 write: lstein todd

Replace them with these lines:

 admin: admin
 write: your_login1 your_login2

This means that someone who logs in with username "admin" and a valid password will be granted administrative access to the database (ability to change passwords and shut down the server). Someone who logs in with username "your_login1" or "your_login2" will have write access to the database.

You will now use the makeUserPasswd program to create some passwords. Each time you run this program, it will print out a password line, which should be manually copied and pasted into the bottom of serverpasswd.wrm:

 % /usr/local/acedb/bin/makeUserPasswd admin
 // Please enter passwd: *******
 // Please re-enter passwd: *******
 // The following line is a valid entry for wspec/serverpasswd.wrm
 admin 5b11966a419e057ef0b7b917746e934c

You should do this once for the administrator, and once for each of the users who have write access; then add these lines to the bottom of serverpasswd.wrm. When you are done, it will look like this:

 // serverpasswd.wrm
 admin: admin
 write: your_login1 your_login2
 admin 5b11966a419e057ef0b7b917746e934c
 your_login1 2640075535f3fe296b6797d77bd6a714
 your_login2 05db4280d9f3b1c1aa6e10479aef4243

Finally, back up your work so you don't have to keep redoing it:

   $ cp -p /usr/local/acedb/elegans/wspec/serverpasswd.wrm /usr/local/wormbase/wspec/serverpasswd.wrm

  • server.wrm - legacy configuration file for the RPC server

No extra configuration needed.

  • serverconfig.wrm - configuration information for the socket server

No extra configuration needed.
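The update script is described above as re-copying your saved files automatically. Should you ever need to do the same by hand (for example after a forced reinstall), something along these lines would work. This is a sketch only: the function name is hypothetical, and it handles just the two password files discussed in this section.

```shell
#!/bin/sh
# Sketch: copy customized wspec files from the backup location into a
# freshly installed database. Not part of the WormBase distribution.

# restore_wspec BACKUP_DIR LIVE_DIR
restore_wspec() {
    backup=$1
    live=$2
    for f in passwd.wrm serverpasswd.wrm; do
        if [ -f "$backup/$f" ]; then
            # -p preserves ownership, mode and timestamps where possible
            cp -p "$backup/$f" "$live/$f" && echo "restored $f"
        else
            echo "skipped $f (no backup found)"
        fi
    done
}

# Example:
#   restore_wspec /usr/local/wormbase/wspec /usr/local/acedb/elegans/wspec
```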

Testing AceDB

At this point, you can test whether the socket server runs correctly. Provided that you have added yourself to the acedb group, you can run the following command:

 % ~acedb/bin/sgifaceserver ~acedb/elegans
 // Database directory: /usr/local/acedb/elegans
 // Shared files: /usr/local/acedb
 // #### Server started at 2001-07-23_16:42:31
 // #### host=mondseer.cshl.org  listening port=23100
 // #### Database dir=/usr/local/acedb/elegans
 // ####  Working dir=/usr/local/acedb/elegans
 // #### clientTimeout=600 serverTimeout=600 maxKbytes=0 autoSaveInterval=600
 // Server listening socket 28 created

The line "listening port=23100" indicates that the server is listening to port 23100. Open a new terminal window and use saceclient to confirm that you can communicate with the server:

% ~acedb/bin/saceclient localhost -port 23100
Please enter userid: anonymous
Please enter passwd:
acedb@localhost> find Sequence
// Response: 65 bytes.
// Found 236493 objects in this class
// 236493 Active Objects
acedb@localhost> quit
// Closing connection to server.
// Client sent termination signal by server.
// Response: 13 bytes.
// A bientot
// Please report problems to acedb@sanger.ac.uk
// Bye

The command-line syntax for saceclient is "saceclient <host> -port <port>". When prompted for the userid, enter "anonymous" and just hit return when prompted for a password. We then issued a "find Sequence" command to count the number of sequences in the database (a lot), and "quit" to terminate the connection.
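If you capture the server's startup messages in a file, the port number can be pulled out of the "listening port=" line rather than read by eye. A minimal sketch follows; the function name and the log file path in the example are assumptions.

```shell
#!/bin/sh
# Sketch: extract the port from sgifaceserver's startup output.

# extract_port: read startup messages on stdin and print the first
# port number that follows "listening port=".
extract_port() {
    sed -n 's/.*listening port=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example (server.out is an assumed file holding the startup messages):
#   port=$(extract_port < /tmp/server.out)
#   ~acedb/bin/saceclient localhost -port "$port"
```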

Now test that the admin password works:

% ~acedb/bin/saceclient localhost -port 23100
Please enter userid: admin
Please enter passwd: ******
acedb@localhost> shutdown now
// Client sending shutdown to server
// Client sent termination signal by server.
// Response: 87 bytes.
// 0 Active Objects
// Sorry, emergency shutdown of server now executing
// A bientot
// Please report problems to acedb@sanger.ac.uk
// Bye

When prompted for the userid, we entered "admin" and gave the correct password. The command "shutdown now" causes the server to exit. If we did not have administrative privileges, we would have gotten an "unknown command" error at this stage.

Apache configuration - ~/wormbase/conf/httpd.conf vs. ~/apache/conf/httpd.conf

There are two httpd.conf files to contend with: one in the main directory for the Apache server (/usr/local/apache) and one in the site software for WormBase (/usr/local/wormbase). Getting the Apache configuration right entails getting various directives in these two files to properly mesh. Generally speaking, you want to set things up so any request specific for the WormBase site goes to /usr/local/wormbase/conf/httpd.conf, but also that any HTTP request meant for general purposes can still be handled by the main software and configurations at /usr/local/apache/conf/httpd.conf.

In the prototype file /usr/local/wormbase/conf/httpd.conf.template, you will find an Apache configuration file containing WormBase-specific definitions. These definitions will include:

   CustomLog    /usr/local/wormbase/logs/access_log combined_format
   <Directory /usr/local/wormbase/html> ... </Directory>
   DocumentRoot /usr/local/wormbase/html
   ErrorLog     /usr/local/wormbase/logs/error_log
   LogFormat [various...]
   ServerAdmin webmaster@wormbase.org                         # change to your preferred e-mail address

These and other directives in /usr/local/wormbase/conf/httpd.conf will do the following:

  1. Set /usr/local/wormbase/html to be the document root for static HTML files.
  2. Set /usr/local/wormbase/db to be a script directory under the control of mod_perl's Apache::Registry.
  3. Create transfer and error logs in /usr/local/wormbase/logs
  4. Create an ordinary cgi-bin directory in /usr/local/wormbase/cgi-bin
  5. Put all static .html files under the control of Apache::AddWormbaseBanner, a module that appends the standard WormBase header and footer on all HTML files.

The simplest approach would be to cut-and-paste this file into the main configuration file (/usr/local/apache/conf/httpd.conf), replacing the directives already there. The approach recommended here, though, is to set up a WormBase-specific httpd.conf file:

   $ cp /usr/local/wormbase/conf/httpd.conf.template /usr/local/wormbase/conf/httpd.conf

Then use an Include directive in /usr/local/apache/conf/httpd.conf to import the WormBase-specific directives as needed. For instance, if WormBase is going to be the only website hosted by this server, then comment out all DocumentRoot and <Directory> sections, e.g.:

   #DocumentRoot "/usr/local/apache/htdocs"

from the main configuration file /usr/local/apache/conf/httpd.conf. Also, check that the Port directive in /usr/local/apache/conf/httpd.conf is "Port 80" (it should already be). Then, insert the following at the bottom of /usr/local/apache/conf/httpd.conf:

   Include /usr/local/wormbase/conf/httpd.conf

Conversely, if WormBase is going to be a virtual host (one of several web sites hosted by the server), then create an appropriate <VirtualHost> section in /usr/local/apache/conf/httpd.conf. Here is a template to follow:

   <VirtualHost *:80>
     ServerName wormbase.hostname.org                   # e.g., caltech.wormbase.org
     UseCanonicalName on
     Include /usr/local/wormbase/conf/httpd.conf
   </VirtualHost>
   <VirtualHost *:80>
     ServerName your-non-wormbase.hostname.org
     UseCanonicalName on
     Include /usr/local/apache/conf/local_httpd.conf    # copy base configuration information to here
   </VirtualHost>

For more details see the Apache documentation on virtual hosts. If you use an IP address rather than * in the <VirtualHost> tag, it must be the correct IP address for the server. Likewise, the ServerName must be replaced by a DNS name that correctly resolves to that IP address. N.B.: do not use "www.wormbase.org"! This name is already taken.

You must also set up a file for perl definitions:

   $ cp -p /usr/local/wormbase/conf/perl.startup.template /usr/local/wormbase/conf/perl.startup

However you end up applying it, /usr/local/wormbase/conf/httpd.conf needs a few small adjustments before use. First, change the filler name "your.machine.org" to the actual hostname of your server (or to an appropriate alias, such as "caltech.wormbase.org"); do this both for ServerName and everywhere else the filler name appears in the file. Likewise, change ServerAdmin to the e-mail address of whoever will actually be responsible for administering the mirror.
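It is easy to miss one occurrence of the filler values. The sketch below lists any lines that still contain them; the function name is made up, and the two patterns are simply the filler hostname and the default ServerAdmin address mentioned in this section.

```shell
#!/bin/sh
# Sketch: report template values that have not been customized yet.

# find_placeholders FILE...: print matching lines with line numbers.
# Returns non-zero when nothing is left to change.
find_placeholders() {
    grep -n -e 'your\.machine\.org' -e 'webmaster@wormbase\.org' "$@"
}

# Example:
#   find_placeholders /usr/local/wormbase/conf/httpd.conf
```

No output means that neither filler value remains in the file.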

Also, update the location of the staging directory for dynamically-generated images to one suitable for your installation. This involves the following directive:

 Alias /ace_images  /var/tmp/ace_images

The ace_images directory will be created automatically the first time WormBase needs it, but the directory that contains it, in this case /var/tmp, must be writable by the Apache user (usually "nobody"). The images will eventually occupy approximately 10 MB. If /var/tmp is not appropriate for your system, change the second argument to some location that is more suitable.

WormBase configuration - wormbase/conf/elegans.pm and localdefs.pm

elegans.pm

WormBase uses two main configuration files, elegans.pm and localdefs.pm, located at /usr/local/wormbase/conf.

The first, elegans.pm, contains a variety of Perl definitions that are used by the various WormBase mod_perl scripts. You will want to look through this file, but you probably will not need to make any changes. The sole item you might wish to change controls the location of temporary files:

@PICTURES

This is the location of a temporary staging directory for dynamically-generated images as indicated in conf/httpd.conf. Its value is a list in which the first item is where the images will appear on the Web server (in URL space) and the second item is where they will appear on the filesystem:

  @PICTURES = ('/ace_images' => '/var/tmp/ace_images');

If you changed the location of the staging directory in httpd.conf, you must make the corresponding change here.
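Because the staging directory is named in two places, a quick consistency check can catch a mismatch. The following sketch assumes the Alias directive and the @PICTURES assignment are each written on a single line, in the form shown above; the function names are illustrative.

```shell
#!/bin/sh
# Sketch: compare the image directory in httpd.conf with elegans.pm.

# alias_path HTTPD_CONF: filesystem path of the /ace_images Alias.
alias_path() {
    awk '$1 == "Alias" && $2 == "/ace_images" { print $3; exit }' "$1"
}

# pictures_path ELEGANS_PM: filesystem path from the @PICTURES line.
pictures_path() {
    sed -n "s/.*@PICTURES *= *('[^']*' *=> *'\([^']*\)').*/\1/p" "$1" | head -n 1
}

# check_image_dirs HTTPD_CONF ELEGANS_PM
check_image_dirs() {
    a=$(alias_path "$1")
    p=$(pictures_path "$2")
    if [ -n "$a" ] && [ "$a" = "$p" ]; then
        echo "match: $a"
    else
        echo "mismatch: httpd.conf='$a' elegans.pm='$p'"
        return 1
    fi
}

# Example:
#   check_image_dirs /usr/local/wormbase/conf/httpd.conf \
#                    /usr/local/wormbase/conf/elegans.pm
```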

localdefs.pm

The second file, localdefs.pm, contains site-specific hostnames, ports, and passwords. You will find a template for this file at /usr/local/wormbase/conf/localdefs.pm.template. Copy this file to localdefs.pm and edit the following options as appropriate for your site.

  • $HOST

This is the name of the host where the socket server runs. It is set to "localhost" by default.

  • $PORT

This is the port on which the socket server runs, 2005 by default.

  • $ACEPASS, $USERNAME, $PASSWORD

These three items define the acedb username and password.

  • $MYSQL_HOST, $MYSQL_USER, $MYSQL_PASS

These three items define the mysql host, username, and password.

  • $MASTER

This is used only for the WormBase master site. Should be set to 0.

  • $MIRROR

Whether or not the site is a mirror. Should be set to the name of the mirror.

  • $DEVELOPMENT

Whether or not the site is a development site. Internally, this controls the nature of caching on the site. Should be set to 0.

  • $BLAST2WORMBASE, $WORMBASE2BLAST

These two options control where the blast script directs queries, and where those queries are returned. This is provided in the event that a second standalone blast server is provided. If not, these two options should point to:

$WORMBASE2BLAST=http://your.hostname.org/

Configuring Servers To Start Automatically

The final step is to arrange for Acedb to start automatically and for MySQL to restart if necessary.

Installing MySQL and BLAT monitoring scripts

Run:

   $ cp -i /usr/local/wormbase/util/admin/blat_server.initd /etc/rc.d/init.d/blat_server

Then run:

   $ crontab -u root -e

to add the following entries to root's crontab:

   0 * * * * /usr/local/wormbase/util/admin/restart_mysqld.pl
   0 * * * * /usr/local/wormbase/util/admin/restart_blat.pl
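If you prefer to script this step instead of editing the crontab interactively, the entries can be appended to a dump of the crontab only when they are not already there. This is a sketch under that assumption; the function name and the temporary file in the example are hypothetical.

```shell
#!/bin/sh
# Sketch: add a crontab line to a file unless an identical line exists.

# add_cron_line FILE LINE
add_cron_line() {
    file=$1
    line=$2
    # -F: fixed string (the asterisks are literal), -x: whole line match
    if grep -Fxq -- "$line" "$file" 2>/dev/null; then
        echo "already present"
    else
        printf '%s\n' "$line" >> "$file"
        echo "added"
    fi
}

# Example, run as root (the temporary file name is an assumption):
#   crontab -l > /tmp/root.cron 2>/dev/null
#   add_cron_line /tmp/root.cron '0 * * * * /usr/local/wormbase/util/admin/restart_mysqld.pl'
#   add_cron_line /tmp/root.cron '0 * * * * /usr/local/wormbase/util/admin/restart_blat.pl'
#   crontab /tmp/root.cron
```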

Acedb log rotation

Acedb generates massive log files. To keep these from growing too large, add the following entry to root's crontab (or that of another privileged user):

   10 1 * * * /usr/local/wormbase/bin/rotatelogs.pl

Configuring ACeDB to start automatically under inetd

Some, but not all, UNIX and Unix-like operating systems use inetd to automatically start background processes; in such a system, inetd must be configured to launch the acedb server. (Other systems such as Fedora Linux will use xinetd.)

Where inetd is used, locate the file /etc/inetd.conf, and add the following line to the end:

   2005 stream tcp wait acedb /usr/local/acedb/bin/sgifaceserver sock.acedb /usr/local/acedb/elegans  1200:1200:0

Note that this line may be wrapped into two or more lines when being viewed; but, in the real configuration file, this must be a single unbroken, unwrapped line. The first column indicates the port number to listen to. 2005 is the default used by the WormBase configuration files.

Tell inetd to reload its configuration file by sending it a HUP signal.

   % ps -elf | grep inetd
   140 S root       500     1  0  60   0    -   329 do_sel Jul17 ? inetd
   % kill -HUP 500   # 500 is the process id (fourth column of the ps output above)

You should now be able to talk to the database using saceclient (as an anonymous user):

   % ~acedb/bin/saceclient localhost -port 2005

Configuring Acedb to start automatically under xinetd

First, make sure that xinetd is actually installed (look for the presence of /usr/sbin/xinetd). If not, use the RPM manager (gnorpm or equivalent) to install the xinetd package.

Modern versions of xinetd (e.g., xinetd-2.3.13-6) distributed with Fedora Core 4 should work well. If for some reason you are using an old version (<2.3.7), though, you must upgrade to at least xinetd-2.3.7-4.7x.

Once xinetd is installed, you will now need to create an xinetd configuration file for Acedb. Create a new file named /etc/xinetd.d/acedb with the following contents:

# file: /etc/xinetd.d/acedb
 # default: on
 # description: wormbase acedb database
 service acedb
 {
        protocol                = tcp
        socket_type             = stream
        port                    = 2005
        flags                   = REUSE
        wait                    = yes
        user                    = acedb
        group                   = acedb
        log_on_success          += USERID DURATION
        log_on_failure          += USERID HOST
        server                  = /usr/local/acedb/bin/sgifaceserver
        server_args             = /usr/local/acedb/elegans 1200:1200:0
 }
 

Edit /etc/services. Although xinetd is not supposed to use /etc/services, the following line must be added:

acedb           2005/tcp
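Whether the entry is already present can be checked before editing the file. A minimal sketch, with a made-up function name, that matches the service name and port/protocol fields exactly:

```shell
#!/bin/sh
# Sketch: test whether a services file already defines a given entry.

# has_service FILE NAME PORT/PROTO: succeed if FILE has a line whose
# first two fields are NAME and PORT/PROTO. Comment lines do not match
# because their first field is "#".
has_service() {
    awk -v name="$2" -v port="$3" '
        $1 == name && $2 == port { found = 1 }
        END { exit !found }
    ' "$1"
}

# Example:
#   has_service /etc/services acedb 2005/tcp || echo "entry missing"
```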

Restart xinetd with the following command:

# /etc/rc.d/init.d/xinetd reload (or restart)
 

To kill xinetd, first find the process id and then:

# kill -SIGUSR2 process#
 

You should now be able to talk to the database using saceclient:

% ~acedb/bin/saceclient localhost -port 2005

Note: to know if the server is listening at port 2005, run the following command:

 # netstat -ant | grep LISTEN
 or, for more readable output,
 # netstat -vatp | grep LISTEN

If an error occurs, check /var/log/messages, and the serverlog.wrm and log.wrm files in the current database directory. Common errors include insufficient disk space and inappropriate permissions for the latter two log files. Remember, the acedb server must be able to write to these files.
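The netstat check above can be turned into a yes/no test. The sketch below assumes the Linux netstat layout, in which the local address is the fourth column and the state is the sixth; other systems format this output differently. The function name is illustrative.

```shell
#!/bin/sh
# Sketch: decide from netstat output whether a port is being listened on.

# port_listening PORT: read "netstat -ant" output on stdin and succeed
# if some LISTEN line has a local address ending in :PORT.
port_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Example:
#   netstat -ant | port_listening 2005 && echo "acedb port is open"
```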

Testing The Site

At this point, all components of a WormBase installation have been installed. You can test your installation by restarting the various server components of WormBase.

Restarting AceDB

# Via xinetd:
 $ /etc/rc.d/init.d/xinetd reload (or restart)
 
# Or to kill xinetd, first find the process id and then:
 $ kill -SIGUSR2 process#
 
# ...or using saceclient
 % saceclient localhost -port 2005
 acedb> password:
 acedb> shutdown now
 

MySQL

# Via mysqladmin...
 % mysqladmin -uroot -pPASSWORD shutdown
 
 # or using init.d
 $ /etc/rc.d/init.d/mysql restart
 

Apache

When the configuration files have been checked and adjusted, restart Apache with the following command:

 $ /usr/local/apache/bin/apachectl restart

Look in /usr/local/wormbase/logs/error_log (WormBase-specific errors) and /usr/local/apache/logs/error_log (general errors) for any error messages. If there are none, try fetching the main page. You should see a WormBase banner and footer. The various database searches should also work.

BLAT

% /usr/local/blat/bin/gfServer start localhost 2003 \
     /usr/local/wormbase/blat/*.nib > /dev/null 2>&1 &

Blocking robots

It can be useful to block search engines (such as Google) from crawling over one's mirror. To do this, go to /usr/local/wormbase/html, and make a file called "robots.txt" with the following contents:

   User-agent: *
   Disallow: /

Troubleshooting

There are a number of common problems to check:

Is the acedb socket server starting?

Run "ps" to determine whether the server is indeed starting. If not, go back to the acedb configuration section and confirm that everything is where it should be. Make sure that the /usr/local/acedb/elegans/database directory is writable by the acedb user.

The two acedb logs to check for error messages are both in /usr/local/acedb/elegans/database. Examine log.wrm and serverlog.wrm.
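A small helper can gather these checks in one place. This is a sketch with a hypothetical function name. Note that the writability test applies to whoever runs it, so it is only meaningful when run as the acedb user.

```shell
#!/bin/sh
# Sketch: summarize the state of an acedb database directory.

# acedb_diag DBDIR: report writability and show the tail of both logs.
acedb_diag() {
    dbdir=$1
    if [ -w "$dbdir" ]; then
        echo "writable: $dbdir"
    else
        echo "NOT writable: $dbdir"
    fi
    for f in log.wrm serverlog.wrm; do
        if [ -f "$dbdir/$f" ]; then
            echo "== $f (last 5 lines) =="
            tail -n 5 "$dbdir/$f"
        else
            echo "== $f missing =="
        fi
    done
}

# Example:
#   acedb_diag /usr/local/acedb/elegans/database
```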

Is the acedb socket server crashing?

It is possible that the server is crashing soon after it starts. The symptom of this is that the system gets very busy for a while, and "top" or "ps" shows the server restarting repeatedly. Eventually inetd (or xinetd) will disable the server and issue a syslog message to the effect that it is disabling a "looping" service.

Again, check that acedb is installed properly and that the database directory is writable. Check log.wrm and serverlog.wrm.

"Internal Server Error"

This is typically a symptom that mod_perl isn't installed correctly, a required Perl library is missing, or something is wrong with the configuration. Check the two error_log files (in /usr/local/apache/logs and /usr/local/wormbase/logs) for clues.

The banner displays but the decorative worm images are broken

On some versions of Linux running the libc 2.2 library there is a bug in readdir(), which is the function called to read the contents of a directory. You can check what version of glibc you have by looking at the contents of /lib:

% ls -l /lib/libc-*
-rwxr-xr-x    1 root  root  4101324 Feb 29  2000 /lib/libc-2.1.3.so*

Versions that are at risk will show libc-2.2.so installed. The solution is to upgrade to a more recent version of libc. libc 2.2.3 is known to work correctly.
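The version number can also be extracted from the file name by a script. A minimal sketch, assuming the libc-<version>.so naming shown above (newer systems may name the file differently); the function name is made up.

```shell
#!/bin/sh
# Sketch: print the version embedded in a libc file name.

# libc_version PATH: for a name like libc-2.1.3.so, print 2.1.3.
# Prints nothing if the name does not follow that pattern.
libc_version() {
    basename "$1" | sed -n 's/^libc-\([0-9][0-9.]*\)\.so.*$/\1/p'
}

# Example:
#   for f in /lib/libc-*.so; do libc_version "$f"; done
```

A result of 2.2 corresponds to the at-risk case described above.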

If you are stuck, send copies of the error logs and anything else you think might be useful to lstein@cshl.org and I'll try to help.

Installing scripts to verify that the servers are running

Two scripts in the WormBase directory can be used to ensure that the mysql and blat servers are running. To install them:

% sudo cp /usr/local/wormbase/util/admin/blat_server.initd \
          /etc/rc.d/init.d/blat_server

Place the restart scripts under cron control of a privileged user. These commands will check every hour to see that the servers are running.

 % sudo crontab -u root -e
0 * * * * /usr/local/wormbase/util/admin/restart_mysqld.pl
0 * * * * /usr/local/wormbase/util/admin/restart_blat.pl

At the same time, you might also wish to automate the rotation of logs to prevent them from growing to an unwieldy size. You'll find an appropriate log rotation configuration stanza in util/rotate_wormbase_logs and a log rotate script in /usr/local/wormbase/bin/rotatelogs.pl. You will need both.

# Rotate httpd logs
 10 1 * * * /usr/local/wormbase/bin/rotatelogs.pl
 # Rotate acedb logs
 10 1 * * * logrotate /usr/local/wormbase/util/rotate_wormbase_logs
 

This stanza will check that the acedb server logs do not grow larger than 100 MB.

INSTALL libgd and GD.pm (if installation of GD.pm failed)

libgd

First try to install it with the default package management system of your distro/UNIX. If that fails, you might want to resort to manual installation from source:

Fetch and install libgd:

 # curl -O http://www.boutell.com/gd/http/gd-2.0.33.tar.gz
 # tar xzf gd-2.0.33.tar.gz
 # cd gd-2.0.33
 # ./configure
 # make
 # sudo make install

GD.pm

Try the usual CPAN way, or:

 # curl -O http://stein.cshl.org/WWW/software/GD/GD.pm.tar.gz
 # tar xzf GD.pm.tar.gz
 # cd GD-*/     (the unpacked directory; its exact name depends on the version)
 # perl Makefile.PL -L/usr/local     (or export LD_LIBRARY_PATH=/usr/local)
 # make
 # sudo make install

If you ever happen to want to link against installed libraries in a given directory, LIBDIR, you must either use libtool, and specify the full pathname of the library, or use the `-LLIBDIR' flag during linking and do at least one of the following:

  - add LIBDIR to the `LD_LIBRARY_PATH' environment variable
    during execution
  - add LIBDIR to the `LD_RUN_PATH' environment variable
    during linking
  - use the `-Wl,--rpath -Wl,LIBDIR' linker flag
  - have your system administrator add LIBDIR to `/etc/ld.so.conf'

DO I NEED TO REMOVE OLD VERSIONS OF GD FIRST?

 Finally reverted to 1.19!  Yikes!
 cpan> install LDS/GD-1.19.tar.gz

Install XML libraries and modules

The XML parsing libraries facilitate XML dumps from WormBase. They are not strictly required for a WormBase installation.

expat

As with libgd, try to use your package manager, or:

 % curl -O http://umn.dl.sourceforge.net/sourceforge/expat/expat-1.95.6.tar.gz
 % gnutar -zxf expat-1.95.6.tar.gz
 % pushd expat-1.95.6/
 % ./configure
 % make
 % sudo make install
 % popd

Maintaining a mirror site

See the related document [HowTo_Maintain_A_Wormbase_Mirror] for full details on the update process and how to keep your installation up-to-date automatically via a cron job.

Bits and Pieces

wget "ftp://ftp.wormbase.org/pub/wormbase/misc/images/**

Tweaking the website

If you are running out of memory on your webserver, you can try the following:

  • keep the Googlebot (and other crawlers) away from the dynamic pages via html/robots.txt
    • PLUS less load and no spikes of memory usage
    • MINUS the dynamic pages will not be indexed by Google

For example:

User-agent: *
Disallow: /cgi-bin
Disallow: /perl
Disallow: /mailarch
Disallow: /db

  • limit the effect of memory leakage using httpd.conf
    • MinSpareServers Y (min. number of idle Apache processes hanging around)
    • MaxSpareServers Z (max. number of idle Apache processes hanging around)
    • MaxClients XX (connections beyond this limit are queued rather than refused with "Server unavailable")
    • MaxRequestsPerChild NN (for leakage prevention; a rough ballpark figure is 75)
    • MaxKeepAliveRequests XY (for leakage prevention; a rough ballpark figure is 75)

Apache takes its time to reap idle processes, so be sure to set MaxClients if memory is a problem.

Don't allow more clients than will fit in memory (RAM + swap) on your server, as the OS will most probably kill a random process when it runs out of memory.
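As a rough worked example of this rule, MaxClients can be estimated by dividing the memory left over for Apache by the size of one Apache child process. The sketch below does only that arithmetic. The function name is made up, and the figures in the example (memory reserved for acedb, MySQL and the OS, and the per-child size) are assumptions that you should measure on your own server, e.g. with "top".

```shell
#!/bin/sh
# Sketch: estimate an upper bound for MaxClients.

# max_clients TOTAL_MB RESERVED_MB PER_CHILD_MB
#   TOTAL_MB      RAM plus swap
#   RESERVED_MB   memory set aside for everything that is not Apache
#   PER_CHILD_MB  typical size of one Apache/mod_perl child
max_clients() {
    total_mb=$1
    reserved_mb=$2
    per_child_mb=$3
    echo $(( (total_mb - reserved_mb) / per_child_mb ))
}

# Example with assumed figures: 2 GB RAM + 4 GB swap, 1 GB reserved,
# 60 MB per child:
#   max_clients 6144 1024 60
```

Since relying on swap makes the site very slow, a value well below this bound is usually the safer choice.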

  • Use of a front-side cache

For example, Squid in web accelerator mode. This will drastically reduce the response time for cached pages and take load off Apache. You can always hit some frequently used pages with a script after mirroring new data.

COPYRIGHT INFORMATION

Material in this document is copyright 2003-2005 by the California Institute of Technology, Cold Spring Harbor Laboratory, Washington University at St. Louis, and The Wellcome Trust Sanger Institute. This information is provided "AS-IS" without any warranty, expressed or implied.

AUTHOR

Todd Harris (harris@cshl.edu)