Administration:WormBase Production Environment

From WormBaseWiki

DEPRECATED!

=Overview=
  
The WormBase production environment consists of a series of partially redundant web and database servers, most sitting behind a load-balancing reverse-proxy server running nginx.  This document describes the basic setup and configuration of this environment.

''' Reverse proxy node '''

:Two servers, each running nginx as a load-balancing reverse proxy.  Built-in memcached support establishes a memory cache shared amongst all back end web server nodes.  Requests are distributed in round-robin fashion.

:Services in play: nginx, acedb, starman_webapp, mysql userdb.

''' Web server nodes '''

:Each web cluster node runs the lightweight HTTP server starman listening on port 5000.  This HTTP server is glued via PSGI/Plack to our Catalyst web application.

:Currently, each node is -- with the exception of GBrowse -- almost entirely independent, with its own AceDB and MySQL databases.

:Web cluster nodes are accessible ONLY to the front end proxy.

''' Data mining nodes '''

''' Social feature node '''
  
 
= To resolve =
 
* How/where is the back end node hosting the user database specified?
* Where are log paths specified? These need to be consolidated.
* image caching
* memcache
* differences in configuration files
* set up starman on beta.wormbase.org

nginx:
* ssl
* proxy caching
* serving up of static content
* memcache

To test:
* logging in
* browser compatibility

* set automatic updates of code and restarting of services

= Nodes =
 
 
 
 
 
 
 
 
 
 
 
== Reverse Proxy Node ==

=== nginx ===
 
 
 
 
= Logs =

All relevant logs can be found in /usr/local/wormbase/logs:

   nginx-error.log    // The reverse proxy error log
   nginx-access.log   // The reverse proxy access log
   nginx-cache.log    // The reverse proxy cache log
   catalyst_error.log // The catalyst error log
 
 
 
= Reverse Proxy and Load Balancing via nginx =

==== Installation ====

We'll place nginx entirely within the wormbase root directory.  Its configuration and init files are maintained in the wormbase-admin module.

'''1. Install prerequisites'''

   # Perl Compatible Regular Expression library (on EC2, use yum instead)
   sudo apt-get install libpcre3 libpcre3-dev libssl-dev libc6-dev

   # Fetch and unpack openssl
   wget http://www.openssl.org/source/openssl-0.9.8p.tar.gz
   tar -zxf openssl-0.9.8p.tar.gz

'''2. Get the nginx cache-purge module'''

   cd src/
   curl -O http://labs.frickle.com/files/ngx_cache_purge-1.3.tar.gz
   tar xzf ngx_cache_purge-1.3.tar.gz

'''3. Build and install nginx'''

   curl -O http://nginx.org/download/nginx-1.0.14.tar.gz
   tar xzf nginx*
   cd nginx*
   ./configure \
     --prefix=/usr/local/wormbase/services/nginx-1.0.14 \
     --error-log-path=/usr/local/wormbase/logs/nginx-error.log \
     --http-log-path=/usr/local/wormbase/logs/nginx-access.log \
     --with-http_gzip_static_module \
     --with-http_secure_link_module \
     --with-openssl=../openssl-0.9.8p \
     --add-module=../ngx_cache_purge-1.3
   make
   make install
   cd /usr/local/wormbase/services
   ln -s nginx-1.0.14 nginx

'''4. Symlink the configuration directory'''

   cd /usr/local/wormbase/services/nginx
   mv conf conf.original
   ln -s /usr/local/wormbase/website-admin/nginx/production conf

'''5. Test the configuration file syntax'''

   $ nginx -t

Reference:

   http://nathanvangheem.com/news/nginx-with-built-in-load-balancing-and-caching
  
''' About Load Balancing '''

nginx relies on the NginxHttpUpstreamModule for load balancing.  It is built in by default.  The documentation contains a number of possibly useful configuration directives:

   http://wiki.nginx.org/3rdPartyModules
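A minimal sketch of the load-balancing setup is below. The backend hostnames match the cluster nodes named elsewhere in this document, but the real directives live in the nginx configuration maintained in the wormbase-admin module, so treat this as illustrative only:

```nginx
# Sketch only -- the production configuration in wormbase-admin carries
# many more directives (caching, gzip, static content, etc.).
http {
    upstream wormbase_backends {
        # Round-robin is the default distribution strategy.
        server wb-web1.oicr.on.ca:5000;
        server wb-web2.oicr.on.ca:5000;
        server wb-web3.oicr.on.ca:5000;
        server wb-web4.oicr.on.ca:5000;
    }

    server {
        listen 2011;    # later on, port 80
        location / {
            proxy_pass http://wormbase_backends;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
```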
  
'''6. Generate SSL certificates'''

To generate private (dummy) certificates, run the following openssl commands.

First change directory to where you want to create the certificate and private key:

   $ cd /usr/local/wormbase/shared/services/nginx/conf

Now create the server private key; you'll be asked for a passphrase:

   $ openssl genrsa -des3 -out server.key 1024

Create the Certificate Signing Request (CSR):

   $ openssl req -new -key server.key -out server.csr

Remove the need to enter a passphrase when starting nginx with SSL using this private key:

   $ cp server.key server.key.org
   $ openssl rsa -in server.key.org -out server.key

Finally, sign the certificate using the private key and CSR:

   $ openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

Update the nginx configuration to point to the newly signed certificate and private key.
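The server.crt and server.key generated above are referenced from an SSL server block; a minimal sketch (the listen port and paths relative to the conf directory are illustrative, and the production configuration carries more directives):

```nginx
# Sketch only -- see the wormbase-admin nginx configuration for the real block.
server {
    listen 443;
    ssl on;                          # nginx 0.8/1.0-era syntax
    ssl_certificate      server.crt; # signed certificate from step 6
    ssl_certificate_key  server.key; # passphrase-free private key
}
```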
==== Set nginx to start at server launch ====

   sudo cp /usr/local/wormbase/website-admin/nginx/conf/production/nginx.init /etc/init.d/nginx
   sudo /usr/sbin/update-rc.d -f nginx defaults

This creates links such as:

   /etc/rc5.d/S20nginx -> ../init.d/nginx
  
==== Set up nginx log rotation ====

Add a cron entry:

   30 1 * * * /usr/local/wormbase/website-admin/log_analysis/rotate_nginx_logs.pl
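The rotation logic can be sketched in plain shell as follows. This is a hypothetical stand-in for rotate_nginx_logs.pl, not its actual code; see the real script in website-admin/log_analysis for what production does:

```shell
# Sketch of date-stamped log rotation (illustrative logic only).
rotate_logs () {
    log_dir=$1
    stamp=$(date +%Y-%m-%d)
    for log in "$log_dir"/nginx-*.log; do
        [ -f "$log" ] || continue       # glob matched nothing; skip
        mv "$log" "$log.$stamp"         # e.g. nginx-access.log.2011-06-01
        : > "$log"                      # recreate an empty log in place
    done
}
# On a real node, nginx must then be told to reopen its log files, e.g.:
#   rotate_logs /usr/local/wormbase/logs
#   kill -USR1 "$(cat /usr/local/wormbase/services/nginx/logs/nginx.pid)"
# (the pid file path depends on how nginx was built)
```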
  
==== Starting the Server ====

   sudo /etc/init.d/nginx restart

=== memcached ===
  
'''Install'''

   $ sudo apt-get install memcached

'''Configure'''

''Make memcached listen to all IP addresses, not just requests from localhost:''

   $ emacs /etc/memcached.conf

   -l 127.0.0.1   <--- comment out this line
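After the edit, the relevant portion of /etc/memcached.conf looks roughly like this. The memory and port values shown are the stock Ubuntu defaults; only the commented-out -l line is the point here:

```
# /etc/memcached.conf (excerpt; values are the distribution defaults)
# -l 127.0.0.1      <-- commented out so memcached binds all interfaces
-m 64               # memory cap, in MB
-p 11211            # default memcached port
-u memcache         # run as the memcache user
```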
  
'''Lock down access to memcached via iptables as described below.'''

Deployment status:

   node     memcached installed    memcached configured    iptables configured
   web1
   web2     configured
   web3
   web4
   mining
  
=== Configure iptables ===

    ##################################################
    # New architecture
    #
    # Proxy nodes:
    #    service   port                 accessibility
    #    nginx     2011, later on 80    all
    #    memcached 11211                cluster
    #    starman   5000                 localhost
    #    mysql     3306                 cluster, dev
    #
    # Backend nodes:
    #    service   port                 accessibility
    #    starman   5000                 proxy nodes
    #    memcached 11211                cluster
    #
    ##################################################

    # Proxy
    # The new website front-end proxy, accessible to the world
    $BIN -A INPUT -p tcp --dport 2011 -m state --state NEW -j ACCEPT
    # Open MySQL to other production nodes; they need access to the sessions database
    $BIN -A INPUT -p tcp --dport 3306 -m iprange --src-range 206.108.125.168-206.108.125.190 -j ACCEPT
    # Open memcached
    $BIN -A INPUT -p tcp --dport 11211 -m iprange --src-range 206.108.125.168-206.108.125.190 -j ACCEPT

    # Backend machines
    # starman
    $BIN -A INPUT -p tcp --dport 5000 -m iprange --src-range 206.108.125.168-206.108.125.190 -j ACCEPT
    # memcached
    $BIN -A INPUT -p tcp --dport 11211 -m iprange --src-range 206.108.125.168-206.108.125.190 -j ACCEPT

    # Let me access backend services directly for debugging
    # Starman, port 5000
    $BIN -A INPUT -p tcp -s 206.228.142.230 --dport 5000 -m state --state NEW -j ACCEPT
    # Old site httpd, port 8080
    $BIN -A INPUT -p tcp -s 206.228.142.230 --dport 8080 -m state --state NEW -j ACCEPT
    # memcached
    $BIN -A INPUT -p tcp -s 206.228.142.230 --dport 11211 -m state --state NEW -j ACCEPT

Then restart iptables:

    /etc/init.d/iptables.local restart
  
=== Launch services on the front end machine ===

   # nginx
   $ sudo /etc/init.d/nginx restart

   # memcached
   $ sudo /etc/init.d/memcached restart

   # starman
   $ cd /usr/local/wormbase/website/production/bin
   $ ./starman-production.sh start

== Webserver Nodes ==
 
  
Webserver nodes mount an NFS share containing (almost) everything they need.

A webserver node expects the following layout:

   /usr/local/wormbase
   /usr/local/wormbase/acedb
   /usr/local/wormbase/databases
   /usr/local/wormbase/extlib
   /usr/local/wormbase/services
   /usr/local/wormbase/website/production
   /usr/local/wormbase/website-shared-files

An NFS share provides most of these elements, mounted at /usr/local/wormbase/shared, with symlinks as follows:

   /usr/local/wormbase
   /usr/local/wormbase/acedb
   /usr/local/wormbase/databases -> shared/databases
   /usr/local/wormbase/extlib -> shared/extlib
   /usr/local/wormbase/services -> shared/services
   /usr/local/wormbase/website/production -> shared/website/production
   /usr/local/wormbase/website-shared-files -> shared/website-shared-files
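The symlink layout above can be exercised with a few lines of shell. This sketch runs against a scratch root so it is safe to try anywhere; on a real node WB_ROOT would be /usr/local/wormbase and the shared directory would be the NFS mount:

```shell
# Sketch of the node-local symlinks into the shared tree (scratch root here;
# /usr/local/wormbase on a real node).
WB_ROOT=$(mktemp -d)
mkdir -p "$WB_ROOT/shared/databases" "$WB_ROOT/shared/extlib" \
         "$WB_ROOT/shared/services" "$WB_ROOT/shared/website/production" \
         "$WB_ROOT/shared/website-shared-files" "$WB_ROOT/website"
for d in databases extlib services website-shared-files; do
    ln -sfn "$WB_ROOT/shared/$d" "$WB_ROOT/$d"   # e.g. databases -> shared/databases
done
ln -sfn "$WB_ROOT/shared/website/production" "$WB_ROOT/website/production"
```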
  
Each webserver node hosts its own AceDB database and server, and its own MySQL databases.

Individual webserver nodes should be configured essentially as described in the [[Administration:Installing_WormBase|Installing WormBase]] documentation, except that they do not require nginx.

=== HTTP server: PSGI/Plack + Starman ===

See the [[Administration:Installing_WormBase#Starman:_the_lightweight_http_server|Starman: the lightweight http server]] section in the Installing WormBase documentation.
  
=== Memcached/libmemcached ===

See [[Administration:WormBase_Production_Environment#Memcached.2Flibmemcached|above]] for details.
 
 
=== The Webapp ===

The web app and all Perl libraries will be installed automatically by the deploy_wormbase_webapp.sh script, which leaves a versioned release directory with a production symlink pointing at it:

   /usr/local/wormbase/website/production -> WSXXXX-YYYY.MM.DD-X.XX-XXXX
   /usr/local/wormbase/website/WSXXX-YYYY.MM.DD-X.XX-XXXX

For details on installation of the web app itself, see the [[Administration:Installing_WormBase#Install_the_Webapp|Install The Webapp]] section of the main [[Administration:Installing_WormBase|Installing WormBase]] guide.
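Because production is just a symlink to a versioned release directory, switching releases reduces to flipping that symlink. A sketch (the helper function is hypothetical; deploy_wormbase_webapp.sh is what actually performs deployment):

```shell
# Hypothetical helper illustrating the symlink flip that a deploy performs.
flip_release () {
    website_dir=$1                  # e.g. /usr/local/wormbase/website
    release=$2                      # e.g. a WSXXX-YYYY.MM.DD-X.XX-XXXX directory
    [ -d "$website_dir/$release" ] || return 1   # refuse to point at nothing
    ln -sfn "$website_dir/$release" "$website_dir/production"
}
```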
  
=== Launch services on back end machines ===

   # memcached
   $ sudo /etc/init.d/memcached restart

   # starman
   $ cd /usr/local/wormbase/website/production/bin
   $ ./starman-production.sh start

=== Hadoop Distributed File System (HDFS+Hoop) ===
We use the Hadoop Distributed File System to make it easier and faster to move files around.

Documentation on setting up Hadoop:

   http://hadoop.apache.org/common/docs/current/single_node_setup.html
   http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

Install HDFS on each node:

   cd /usr/local/wormbase/shared/services
   curl -O http://mirror.olnevhost.net/pub/apache//hadoop/common/stable/hadoop-0.20.203.0rc1.tar.gz
   tar xzf hadoop*
   ln -s hadoop-0.20.203 hadoop
   sudo apt-get install default-jre

Start the cluster:

   $ bin/hadoop

'''Standalone operation'''

By default, Hadoop is configured to run in non-distributed mode, as a single Java process. This is useful for debugging.

The following example copies the unpacked conf directory to use as input, then finds and displays every match of the given regular expression. Output is written to the given output directory.

   $ mkdir input
   $ cp conf/*.xml input
   $ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
   $ cat output/*
  
'''Set up a single node'''

Create a hadoop group and user:

   addgroup hadoop
   adduser --ingroup hadoop hduser

Configure the hduser:

   sudo su hduser

Append the following to .bashrc:

<pre>
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/wormbase/services/hadoop

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-6-sun

# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
    hadoop fs -cat $1 | lzop -dc | head -1000 | less
}

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
</pre>

Create an ssh key:

   ssh-keygen -t rsa -P ""

Let the hadoop user connect to localhost:

   cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
   chmod 600 $HOME/.ssh/authorized_keys

Test:

   ssh localhost

Set up hadoop. In /usr/local/wormbase/shared/services/hadoop/conf/hadoop-env.sh, set JAVA_HOME:

   JAVA_HOME=/usr/lib/jvm/default-java

Configure the directory where hadoop stores its files in conf/core-site.xml:
 
<pre>
<!-- In: conf/core-site.xml -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose
  scheme and authority determine the FileSystem implementation. The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
</pre>
  
Create the temporary directory configured as hadoop.tmp.dir:

   $ sudo mkdir -p /app/hadoop/tmp
   $ sudo chown hduser:hadoop /app/hadoop/tmp
   # ...and if you want to tighten up security, chmod from 755 to 750...
   $ sudo chmod 750 /app/hadoop/tmp

In conf/mapred-site.xml:
 
<pre>
<!-- In: conf/mapred-site.xml -->
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at. If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>
</pre>
  
In conf/hdfs-site.xml:

<pre>
<!-- In: conf/hdfs-site.xml -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
</pre>
  
Format the namenode:

   hduser@ubuntu:~$ /usr/local/wormbase/services/hadoop/bin/hadoop namenode -format

=== HBase ===

   curl -O http://www.reverse.net/pub/apache//hbase/stable/hbase-0.90.3.tar.gz
 
  
=== CouchDB ===

We use the document store CouchDB to store and replicate pregenerated HTML and other content across web server nodes.

See the [http://guide.couchdb.org/editions/1/en/index.html CouchDB manual] for a good introduction.

This [http://stackoverflow.com/questions/1911226/using-couchdb-to-serve-html post on StackOverflow] discusses serving HTML via couch, too.

Comparing CouchDB and MongoDB? See [http://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288 this SlideShare presentation] and [http://www.mongodb.org/display/DOCS/Comparing+Mongo+DB+and+Couch+DB this post] on the MongoDB site.

==== Installation ====

Install dependencies:

   sudo apt-get install erlang libicu-dev libmozjs-dev libcurl4-openssl-dev
 
  
Build:

   cd src
   curl -O http://www.apache.org/dyn/closer.cgi?path=/couchdb/1.1.0/apache-couchdb-1.1.0.tar.gz
   tar xzf apache-*
   cd apache-*
   ./configure
   make && sudo make install
  
Create a couchdb user if one doesn't already exist:

   adduser --system \
           --home /usr/local/var/lib/couchdb \
           --no-create-home \
           --shell /bin/bash \
           --group --gecos \
           "CouchDB Administrator" couchdb

Fix permissions:

   chown -R couchdb:couchdb /usr/local/etc/couchdb
   chown -R couchdb:couchdb /usr/local/var/lib/couchdb
   chown -R couchdb:couchdb /usr/local/var/log/couchdb
   chown -R couchdb:couchdb /usr/local/var/run/couchdb
   chmod 0770 /usr/local/etc/couchdb
   chmod 0770 /usr/local/var/lib/couchdb
   chmod 0770 /usr/local/var/log/couchdb
   chmod 0770 /usr/local/var/run/couchdb

Set up CouchDB to start up automatically:

   cp /usr/local/etc/init.d/couchdb /etc/init.d/couchdb
   sudo update-rc.d couchdb defaults

Edit the defaults file (/usr/local/etc/couchdb/default.ini) to allow more open permissions:

   bind_address = 0.0.0.0

Start it up:

   sudo /etc/init.d/couchdb start

Test it out:

   curl http://127.0.0.1:5984/

Get a list of databases:

   curl -X GET http://127.0.0.1:5984/_all_dbs

Create a database for each new release of WormBase:

   curl -X PUT http://127.0.0.1:5984/wsXXX    # We'll create databases for each WSXXX version

Delete a database:

   curl -X DELETE http://127.0.0.1:5984/wsXXX

Create a document:

   curl -X PUT http://127.0.0.1:5984/wsXXX/UUID

Get a document:

   curl -X GET http://127.0.0.1:5984/wsXXX/UUID

Create a document with an attachment:

   curl -X PUT http://127.0.0.1:5984/wsXXX/UUID/attachment \
        -d @/usr/local/wormbase/databases/WS226/cache/gene/overview/WBGene00006763.html -H "Content-Type: text/html"

Get the document's attachment directly:

   curl -X GET http://127.0.0.1:5984/wsXXX/UUID/attachment

==== Securing the database ====

Add an admin user and password to:

   /usr/local/etc/couchdb/local.ini

The password will be hashed automatically.

==== Populating the database ====

During the staging process, the precache_widgets.pl script creates then populates a CouchDB instance on the development server. See that script and its related modules for examples.

In the future, we might also want to explore bulk inserts and [http://guide.couchdb.org/editions/1/en/performance.html performance tuning].

==== Querying the database ====

Get some general information about a database:

   curl -X GET http://127.0.0.1:5984/wsxxx/
  
==== Replication ====

See the [http://guide.couchdb.org/draft/replication.html guide to replication] in the CouchDB book.

Replication is handled by the production/steps/replicate_couchdb.pl script.
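CouchDB replication is driven by POSTing a JSON document to the _replicate endpoint. A sketch of the request such a script has to send (the database name and target host here are illustrative, not what replicate_couchdb.pl actually uses):

```shell
# Assemble a _replicate request body; source/target values are illustrative.
REPL_BODY='{"source":"wsXXX","target":"http://wb-web2.oicr.on.ca:5984/wsXXX"}'
echo "$REPL_BODY"
# Against a live CouchDB one would then run:
#   curl -X POST http://127.0.0.1:5984/_replicate \
#        -H 'Content-Type: application/json' -d "$REPL_BODY"
```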
  
 +
==== Configuration ====
  
 +
Edit /etc/couchdb/default.ini to the following.
  
 +
==== The Futon Management interface ====
  
 +
==== Couchapp ====
  
= Servers =
+
https://github.com/couchapp/couchapp
  
== Configuring iptables ==
 
  
We don't want end users to directly access back end machines. Instead, we want to force them to pass through the caching proxy. For now, we will only allow the reverse proxy access to the server on port 80.
+
Measuring performance:
 +
http://till.klampaeckel.de/blog/archives/16-measuring-couchdb-performance.html
  
We only want the front end proxy (currently at CSHL) to be able to access the machine directly on port 80. See <code>conf/iptables.local</code> in the wormbase-admin module for details.
+
== Data mining nodes ==
  
== The Wiki, Blog, and Forums Server ==
+
The data mining and BLAST/BLAT server replaces the old aceserver. Because it handles requests for the AQL and WB pages, it is configured exactly as web cluster nodes, with the addition of BLAST, BLAT, and ePCR, and the iptable directives added as shown above.
  
The WormBase Blog, the WormBase Wiki, and the Worm Community Forums all rely on third party software.  To make it easy to update this software, each of these components is maintained as a separate virtual host running on its own port.
+
== WormMart node ==
  
=== Iptables ===
+
== Social feature node ==
  
The blog, wiki, and forums are all virtual hostsHowever, we don't want the server to respond on port 80 for requests to the machine's IP.
+
The WormBase Blog, the WormBase Wiki, and the Worm Community Forums all rely on third party softwareTo make it easy to update this software, each of these components is maintained as a separate name-based virtual host running on the same server: wb-social.oicr.on.ca.
  
 
=== The WormBase Blog ===
 
=== The WormBase Blog ===
Line 495: Line 587:
 
''The WormBase blog is a subdomain of wormbase.org: blog.wormbase.org.  If it's moved, the DNS entry *must* be updated!''
 
''The WormBase blog is a subdomain of wormbase.org: blog.wormbase.org.  If it's moved, the DNS entry *must* be updated!''
  
       Host/Port : wb-acedb1.oicr.on.ca:80
+
       Host/Port : wb-social.oicr.on.ca:80
 
       Alias: blog.wormbase.org
 
       Alias: blog.wormbase.org
 
  MySQL database : wormbase_wordpress_blog
 
  MySQL database : wormbase_wordpress_blog
Line 533: Line 625:
 
''The WormBase Wiki is a subdirectory of the primary WormBase domain.  If it's moved, the proxy that sits in front of it must be updated!''
 
''The WormBase Wiki is a subdirectory of the primary WormBase domain.  If it's moved, the proxy that sits in front of it must be updated!''
  
       Host/Port : wb-acedb1.oicr.on.ca:80
+
       Host/Port : wb-social.oicr.on.ca:80
 
       Alias: wiki.wormbase.org
 
       Alias: wiki.wormbase.org
 
  MySQL database : wormbase_wiki
 
  MySQL database : wormbase_wiki
Line 568: Line 660:
 
''The WormBase Wiki is a subdirectory of the primary WormBase domain.  If it's moved, the proxy that sits in front of it must be updated!''
 
''The WormBase Wiki is a subdirectory of the primary WormBase domain.  If it's moved, the proxy that sits in front of it must be updated!''
  
       Host/Port : wb-acedb1.oicr.on.ca:80
+
       Host/Port : wb-social.oicr.on.ca:80
 
       Alias: forums.wormbase.org
 
       Alias: forums.wormbase.org
 
  MySQL database : wormbaseforumssmf
 
  MySQL database : wormbaseforumssmf
Line 603: Line 695:
 
Note: If the forum is moved, it is also necessary to update Settings.php and the paths to the Sources and Themes directories in the forum Administration Panel > Configuration > Server Settings.
 
Note: If the forum is moved, it is also necessary to update Settings.php and the paths to the Sources and Themes directories in the forum Administration Panel > Configuration > Server Settings.
  
== The Datamining and BLAST server ==
+
= NFS =  
 +
 
 +
We use NFS at WormBase to consolidate logs, simplify maintainence of temporary and static files, and provide access to common (file-system) databases.
 +
 
 +
NFS is served from wb-web1.  '''NOTE: This may be a I/O performance bottleneck - we may need a separate machine'''
 +
 
 +
  /usr/local/wormbase/shared
 +
 
 +
This directory will be mounted at the same path for each client.
 +
 
 +
Contents:
 +
 
 +
logs/          // Server logs
 +
databases  // File-based databases for various searches
 +
tmp            // tmp images
 +
static          // static files
 +
website-shared-files // images and other shared files
 +
 
 +
Note that logs and databases are symlinks in each production node:
 +
 
 +
  /usr/local/wormbase/databases -> shared/databases
 +
 
 +
''' Install NFS'''
 +
 
 +
On Debian, the NFS server needs:
 +
 +
sudo apt-get install nfs-kernel-server nfs-common portmap
 +
 
 +
''' Specify what to share, to whom, and with which privileges '''
 +
 
 +
Share mounts are configured through /etc/exports
 +
 
 +
# WormBase NFS for temporary and static content
 +
/usr/local/wormbase/shared 206.108.125.177(rw,no_root_squash) 206.108.125.190(rw,no_root_squash) 206.108.125.168(rw,no_root_squash) 206.108.125.191(rw,no_root_squash)
 +
 
 +
After making changes to /etc/exports, load them by:
 +
 
 +
sudo exportfs -a
 +
 
 +
''' Lock down access to NFS from most hosts '''
 +
 
 +
We can use /etc/hosts.deny to quickly lock down NFS:
 +
 
 +
portmap:ALL
 +
lockd:ALL
 +
mountd:ALL
 +
rquotad:ALL
 +
statd:ALL
 +
 
 +
''' Allow access to select hosts '''
 +
 
 +
We can use /etc/hosts.allow to specifically allow access to NFS:
 +
 
 +
# WormBase NFS services
 +
portmap: ***.***.***.***
 +
lockd: ***.***.***.***
 +
rquotad: ***.***.***.***
 +
mountd: ***.***.***.***
 +
statd: ***.***.***.***
 +
 
 +
== Configure NFS clients ==
 +
 
 +
''' Install NFS '''
 +
 
 +
On Debian, NFS clients require:
 +
 +
sudo apt-get install nfs-common portmap
 +
 
 +
''' Setting up NFS share to mount at boot '''
 +
 
 +
sudo emacs /etc/fstab
 +
 
 +
Add
 +
 
 +
wb-web1.oicr.on.ca:/usr/local/wormbase/shared /usr/local/wormbase/shared nfs rw,rsize=32768,wsize=32768,intr,noatime  0 0
 +
 
 +
''' Manually mount the NFS share '''
 +
 
 +
sudo mount ${NFS_SERVER}:/usr/local/wormbase/shared /usr/local/wormbase/shared    // The shared dir must already exist
 +
or using the entry in fstab:
 +
sudo mount /usr/local/wormbase/shared
 +
 
 +
''' Unmounting the NFS share '''
 +
 
 +
sudo umount /usr/local/wormbase/shared
 +
 
 +
= Miscellaneous =
 +
 
 +
== Build the user preferences database ==
 +
 
 +
''The website uses a mysql backend to store user preferences, browsing history, session data.  This shouldn't ever need to be recreated (at least until we have a migration path in place from an old database to a new one!), but here's how to create it for reference.  For now, this database is hosted on  the same server providing the reverse proxy.''
 +
 
 +
mysql -u root -p < /usr/local/wormbase/website/production/util/user_login.sql
 +
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@localhost';
 +
 
 +
# All nodes currently use the same session database.
 +
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-web1.oicr.on.ca';
 +
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-web2.oicr.on.ca';
 +
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-web3.oicr.on.ca';
 +
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-web4.oicr.on.ca';
 +
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-mining.oicr.on.ca';
 +
 
 +
'''Q: How/Where do I configure the location of the wormbase_user database in the application?'''
 +
 
 +
== Logs ==
 +
 
 +
All relevant logs can be found at:
 +
 
 +
ls /usr/local/wormbase/logs
 +
nginx-error.log    // The reverse proxy error log
 +
nginx-access.log  // The reverse proxy access log
 +
nginx-cache.log    // The reverse proxy cache log
 +
catalyst_error.log // The catalyst error log
 +
 
 +
= See Also =
 +
 
 +
Monitoring
 +
 
 +
Logs
 +
 
 +
Google
 +
 
 +
 
 +
 
 +
----
 +
 
 +
EVERYTHING BELOW HERE IS DEPRECATED
 +
 
 +
= Monitoring =
 +
 
 +
See the monitoring services document? Nagios requires apache and fcgi
 +
 
 +
Should I preserve the fastcgi,fcgi configuration just in case?
 +
 
 +
==FastCGI, FCGI, Apache, and mod_perl ==
 +
 
 +
Originally, WormBase ran under apache + mod_perl.
 +
 
 +
We also experimented with fcgi and fcgid +apache.
 +
 
 +
 
 +
==Installing fastcgi ==
 +
 
 +
<pre>
 +
curl -O http://www.fastcgi.com/dist/mod_fastcgi-2.4.6.tar.gz
 +
tar xzf mod_fastcgi*
 +
cd mod_fastcgi*
 +
cp Makefile.AP2 Makefile
 +
make top_dir=/usr/local/apache2
 +
sudo make top_dir=/usr/local/apache2 install
 +
</pre>
 +
 
 +
If you get an error on make saying it can't find special.mk (which is supposed to be distributed with httpd but isn't on CentOS and is not part of httpd-devel, either), try:
 +
<pre>
 +
sudo apxs -n mod_fastcgi -i -a -c mod_fastcgi.c fcgi_buf.c fcgi_config.c fcgi_pm.c fcgi_protocol.c fcgi_util.c
 +
</pre>
 +
 
 +
Add an entry to httpd.conf like this:
 +
<pre>
 +
LoadModule fastcgi_module modules/mod_fastcgi.so
 +
 
 +
// Note: if you use the apxs command above, it inserts an incorrect line into your httpd.conf file.
 +
// Edit it to read exactly as above.
 +
</pre>
 +
 
 +
==Launch the fastcgi server==
 +
 
 +
<pre>
 +
  // as a socket server in daemon mode
 +
  /usr/local/wormbase/website/script/wormbase_fastcgi.pl \
 +
      -l /tmp/wormbase.sock -n 5 -p /tmp/wormbase.pid -d
 +
 
 +
    // as a deamon bound to a specific port
 +
    script/wormbase_fastcgi.pl -l :3001 -n 5 -p /tmp/wormbase.pid -d
 +
</pre>
 +
 
 +
== Set up the fastcgi server to launch at boot ==
 +
 
 +
Symlink the webapp-fastcgi.init script to /etc/init.d
 +
 
 +
cd /etc/init.d
 +
sudo ln -s /usr/local/wormbase/website/util/init/webapp-fastcgi.init wormbase-fastcgi
 +
 
 +
Set up symlinks in runlevels:
 +
 
 +
cd ../rc3.d
 +
sudo ln -s ../init.d/wormbase-fastcgi S99wormbase-fastcgi
 +
cd ../rc5.d
 +
sudo ln -s ../init.d/wormbase-fastcgi S99wormbase-fastcgi
 +
 
 +
== Add a cron job that keeps FCGI under control ==
 +
 
 +
''The following cron job will kill off fcgi children that exceed the specified memory limit (in bytes).
 +
 
 +
sudo crontab -e
 +
*/30 * * * * /usr/local/wormbase/website/util/crons/fastcgi-childreaper.pl \
 +
                `cat /tmp/wormbase.pid` 104857600
 +
 
 +
== mod_fcgid ==
 +
 
 +
mod_fcgid is an alternative to fcgi
 +
 
 +
cd src/
 +
wget http://www.carfab.com/apachesoftware/httpd/mod_fcgid/mod_fcgid-2.3.5.tar.gz
 +
tar xzf mod_fcgid-2.3.5.tar.gz
 +
cd mod_fcgid-2.3.5 
 +
APXS=/usr/local/apache2/bin/apxs ./configure.apxs 
 +
make
 +
sudo make install
 +
 
 +
=Apache=
 +
 
 +
==Configure Apache to connect to the fastcgi server==
 +
''Edit /usr/local/apache2/conf/extra/httpd-vhosts.conf''
 +
 
 +
<pre>
 +
<VirtualHost *:8000>
 +
    #    ServerName beta.wormbase.org                                                                                   
 +
    ErrorLog /usr/local/wormbase/logs/wormbase2.error_log
 +
    TransferLog /usr/local/wormbase/logs/wormbase2.access_log
 +
 
 +
 
 +
    # 502 is a Bad Gateway error, and will occur if the backend server is down
 +
    # This allows us to display a friendly static page that says "down for
 +
    # maintenance"
 +
    Alias /_errors /home/todd/projects/wormbase/website/trunk/root/error-pages
 +
    ErrorDocument 502 /_errors/502.html
 +
 
 +
    # Map dynamic images to the file system
 +
    # static images are located at img
 +
    Alias /images      /tmp/wormbase/images/
 +
 +
  #  <Directory /filesystem/path/to/MyApp/root/static>
 +
  #      allow from all
 +
  #  </Directory>
 +
  #  <Location /myapp/static>
 +
  #      SetHandler default-handler
 +
  #  </Location>
 +
 
 +
    # Static content served directly by Apache
 +
    DocumentRoot /usr/local/wormbase/website/root
 +
    #    Alias /static /usr/local/wormbase/website-2.0/root
 +
 
 +
 
 +
 
 +
    # Approach 1: Running as a static server (Apache handles spawning of the webapp)     
 +
    # <IfModule fastcgi_module>
 +
    #    FastCgiServer /usr/local/wormbase/website-2.0/script/wormbase_fastcgi.pl -processes 3                     
 +
    #    Alias / /usr/local/wormbase/website-2.0/script/wormbase_fastcgi.pl/
 +
    # </IfModule>
 +
                                 
 +
 
 +
    # Approach 2: External Process (via mod_fcgi ONLY)
 +
    <IfModule mod_fastcgi.c>
 +
        # This says to connect to the Catalyst fcgi server running on localhost, port 777
 +
        #  FastCgiExternalServer /tmp/myapp.fcgi -host localhost:7777
 +
        # Or to use the socket     
 +
        FastCgiExternalServer /tmp/wormbase.fcgi -socket /tmp/wormbase.sock
 +
 
 +
        # Place the app at root...
 +
        Alias /    /tmp/wormbase.fcgi/
 +
 
 +
        # ...or somewhere else
 +
        Alias /wormbase/ /tmp/wormbase.fcgi/
 +
      </IfModule>
 +
 
 +
    # fcgid configuration
 +
    #    <IfModule mod_fcgid>
 +
    #        # This should point at your myapp/root
 +
    #          DocumentRoot /usr/local/wormbase/beta.wormbase.org/root
 +
    #        Alias /static /usr/local/wormbase/beta.wormbase.org/root/static
 +
    #        <Location /static>
 +
    #                  SetHandler default-handler
 +
    #          </Location>
 +
    #
 +
    #        Alias / /usr/local/wormbase/beta.wormbase.org/script/wormbase_fastcgi.pl/
 +
    #        AddType application/x-httpd-php .php
 +
    #        <Location />
 +
    #                  Options ExecCGI
 +
    #                  Order allow,deny
 +
    #                  Allow from all
 +
    #                  AddHandler fcgid-script .pl
 +
    #          </Location>
 +
    #    </IfModule>
 +
 
 +
  </VirtualHost>
 +
</pre>
 +
 
 +
''Edit /usr/local/apache2/conf/httpd.conf''
 +
 
 +
Add the appropriate '''Listen PORT''' directive.
  
The data mining and BLAST/BLAT server replaces the old aceserver. Because it handles requests for the AQL and WB pages, it runs the full website and has all mysql and acedb databases.
+
[[Category: Legacy Architecture (Web Dev)]]

Latest revision as of 20:48, 18 June 2014

DEPRECATED!


Overview

The WormBase production environment consists of a series of partially redundant web and database servers, most sitting behind a load-balancing reverse-proxy server running nginx. This document describes the basic setup and configuration of this environment.

Reverse proxy node

Two servers, each running nginx as a load-balancing reverse proxy. nginx's built-in memcached support establishes a memory cache shared amongst all back end web server nodes. Requests are distributed to the back end nodes in round-robin fashion.

  • nginx
  • acedb
  • starman_webapp
  • mysql userdb

Web server nodes

Each web cluster node runs the lightweight HTTP server starman listening on port 5000. This http server is glued via PSGI/Plack/Starman to our Catalyst web application.
Currently, each node is -- with the exception of GBrowse -- almost entirely independent, with its own AceDB and MySQL databases.
Web cluster nodes are accessible ONLY to the front end proxy.

Data mining nodes

Social feature node
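As a minimal sketch of the Starman setup described above: a trivial PSGI application run under Starman on port 5000. The file path, worker count, and application body here are illustrative only; production runs the Catalyst application via starman-production.sh.

```shell
# Write a stand-in PSGI app (illustrative only; production uses the
# Catalyst application's .psgi file).
cat > /tmp/app.psgi <<'EOF'
my $app = sub {
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ "ok\n" ] ];
};
$app;
EOF

# Run it under Starman exactly as a web node would (commented out here;
# requires Starman to be installed):
# starman --listen :5000 --workers 5 /tmp/app.psgi
```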

To resolve

  • How/where is the back end node hosting the user database specified?
  • Differences in configuration files.

nginx

  • ssl
  • proxy caching
  • to test
  • logging in
  • browser compatibility
  • set up automatic updates of code and restarting of services

Nodes

Reverse Proxy Node

nginx

Installation

We'll place nginx entirely within the wormbase root directory. Its configuration and init files are maintained in the wormbase-admin module.

1. Install prerequisites

  # Perl Compatible Regular Expression library (on EC2, install the
  # equivalent packages with yum)
  sudo apt-get install libpcre3 libpcre3-dev libssl-dev libc6-dev
  # Fetch and unpack openssl
  wget http://www.openssl.org/source/openssl-0.9.8p.tar.gz
  tar -zxf openssl-0.9.8p.tar.gz

2. Get the nginx cache-purge module

  cd src/
  curl -O http://labs.frickle.com/files/ngx_cache_purge-1.3.tar.gz
  tar xzf ngx_cache_purge-1.3.tar.gz

3. Build and install nginx

  curl -O http://nginx.org/download/nginx-1.0.14.tar.gz
  tar xzf nginx*
  cd nginx*
  ./configure \
    --prefix=/usr/local/wormbase/services/nginx-1.0.14 \
    --error-log-path=/usr/local/wormbase/logs/nginx-error.log \
    --http-log-path=/usr/local/wormbase/logs/nginx-access.log \
    --with-http_stub_status_module \
    --with-http_ssl_module \
    --with-ipv6 \
    --with-http_realip_module \
    --with-http_addition_module \
    --with-http_image_filter_module \
    --with-http_sub_module \
    --with-http_dav_module \
    --with-http_flv_module \
    --with-http_gzip_static_module \
    --with-http_secure_link_module \
    --with-openssl=../openssl-0.9.8p \
    --add-module=../ngx_cache_purge-1.3
  make
  make install
  cd /usr/local/wormbase/services
  ln -s nginx-1.0.14 nginx
  cd /usr/local/wormbase/services/nginx
  mv conf conf.original
  ln -s /usr/local/wormbase/website-admin/nginx/production conf

4. Test the configuration file syntax by:

$ nginx -t

Here's a more complicated example demonstrating caching and load balancing: http://nathanvangheem.com/news/nginx-with-built-in-load-balancing-and-caching

About Load Balancing

nginx relies on the NginxHttpUpstreamModule for load balancing. It's built-in by default. The documentation contains a number of possibly useful configuration directives:

 http://wiki.nginx.org/NginxHttpUpstreamModule

There are a number of other interesting load-balancing modules that might be of use:

 http://wiki.nginx.org/3rdPartyModules
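A minimal sketch of what the upstream configuration looks like, assuming the four web nodes named elsewhere in this document; the real production config lives in the wormbase-admin module, so treat the names and directives below as illustrative only.

```nginx
# Round-robin upstream pool of back end starman nodes (sketch only).
upstream wormbase_backends {
    server wb-web1.oicr.on.ca:5000;
    server wb-web2.oicr.on.ca:5000;
    server wb-web3.oicr.on.ca:5000;
    server wb-web4.oicr.on.ca:5000;
}

server {
    listen 80;
    location / {
        # Forward to whichever back end the round-robin scheduler picks.
        proxy_pass http://wormbase_backends;
        proxy_set_header Host      $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```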


5. Generate SSL certificates

To generate private (dummy) certificates, run the following openssl commands.

First change directory to where you want to create the certificate and private key:

$ cd /usr/local/wormbase/shared/services/nginx/conf
Now create the server private key; you'll be asked for a passphrase:
$ openssl genrsa -des3 -out server.key 1024
Create the Certificate Signing Request (CSR):
$ openssl req -new -key server.key -out server.csr
Remove the necessity of entering a passphrase for starting up nginx with SSL using the above private key:
$ cp server.key server.key.org
$ openssl rsa -in server.key.org -out server.key
Finally sign the certificate using the above private key and CSR:
$ openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt
Update Nginx configuration to point to the newly signed certificate and private key.

Set nginx to start at server launch

sudo cp /usr/local/wormbase/website-admin/nginx/conf/production/nginx.init /etc/init.d/nginx
 sudo /usr/sbin/update-rc.d -f nginx defaults

The output will be similar to this:

Adding system startup for /etc/init.d/nginx ...
  /etc/rc0.d/K20nginx -> ../init.d/nginx
  /etc/rc1.d/K20nginx -> ../init.d/nginx
  /etc/rc6.d/K20nginx -> ../init.d/nginx
  /etc/rc2.d/S20nginx -> ../init.d/nginx
  /etc/rc3.d/S20nginx -> ../init.d/nginx
  /etc/rc4.d/S20nginx -> ../init.d/nginx
  /etc/rc5.d/S20nginx -> ../init.d/nginx


Set up nginx log rotation

Add a crontab entry:

30 1 * * * /usr/local/wormbase/website-admin/log_analysis/rotate_nginx_logs.pl

Starting the Server

sudo /etc/init.d/nginx restart

memcached

Install

$ sudo apt-get install memcached

Configure

Make memcached listen on all IP addresses, not just requests from localhost:

$ emacs /etc/memcached.conf
-l 127.0.0.1   <--- comment out this line

Lock down access to memcached via iptables as described below.

Deployment checklist (memcached installed, memcached configured, iptables configured):

 web1
 web2 configured
 web3
 web4
 mining

Configure iptables

   ##################################################         
   # New architecture 
   #                                                                 
   # Proxy nodes:
   #    service   port                 accessibility 
   #    nginx     2011, later on 80    all                                                            
   #    memcached 11211                cluster
   #    starman   5000                 localhost
   #    mysql     3306                 cluster, dev                                                
   #   
   # Backend nodes                                                                         
   #    service   port                 accessibility                                                        
   #    starman   5000                 proxy nodes                                             
   #    memcached 11211                cluster                                     
   #                                                              
   ##################################################
   # Proxy
   # The new website front-end proxy, accessible to the world                                                          
   $BIN -A INPUT -p tcp --dport 2011 -m state --state NEW -j ACCEPT
   # Open MySQL to other production nodes; need access to the sessions database                                      
   $BIN -A INPUT -p tcp --dport 3306 -m iprange --src-range 206.108.125.168-206.108.125.190 -j ACCEPT
   # Open memcached
   $BIN -A INPUT -p tcp --dport 11211 -m iprange --src-range 206.108.125.168-206.108.125.190 -j ACCEPT
   # Backend machines
   # starman
   $BIN -A INPUT -p tcp --dport 5000 -m iprange --src-range 206.108.125.168-206.108.125.190 -j ACCEPT
   # memcached
   $BIN -A INPUT -p tcp --dport 11211 -m iprange --src-range 206.108.125.168-206.108.125.190 -j ACCEPT
   # Let me access backend services directly for debugging                                                                                   
   # Starman, port 5000                                                                                                                     
   $BIN -A INPUT -p tcp -s 206.228.142.230 --dport 5000 -m state --state NEW -j ACCEPT
   # Old site httpd, port 8080                                                                                                              
   $BIN -A INPUT -p tcp -s 206.228.142.230 --dport 8080 -m state --state NEW -j ACCEPT
   # memcached                                                                                                                              
   $BIN -A INPUT -p tcp -s 206.228.142.230 --dport 11211 -m state --state NEW -j ACCEPT

Then

/etc/init.d/iptables.local restart

Launch services on the front end machine

# nginx
$ sudo /etc/init.d/nginx restart
# memcached
$ sudo /etc/init.d/memcached restart
# starman
cd /usr/local/wormbase/website/production/bin
$ ./starman-production.sh start
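A quick way to confirm the services above are actually listening. This is a sketch, not part of the WormBase tooling: check_port is a helper defined here, and the ports are the ones used in this document.

```shell
# Return "up" or "down" depending on whether 127.0.0.1:$1 accepts a
# TCP connection (uses bash's /dev/tcp pseudo-device).
check_port () {
    if (echo > /dev/tcp/127.0.0.1/"$1") 2>/dev/null; then
        echo up
    else
        echo down
    fi
}

for svc in nginx:80 memcached:11211 starman:5000; do
    echo "${svc%%:*} (port ${svc##*:}): $(check_port "${svc##*:}")"
done
```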

Webserver Nodes

Webserver nodes mount an NFS share containing (almost) everything they need.

A webserver node expects the following layout:

/usr/local/wormbase
/usr/local/wormbase/acedb
/usr/local/wormbase/databases 
/usr/local/wormbase/extlib
/usr/local/wormbase/services
/usr/local/wormbase/website/production
/usr/local/wormbase/website-shared-files


An NFS share provides most of these elements, mounted at /usr/local/wormbase/shared, with symlinks as:

/usr/local/wormbase

/usr/local/wormbase/acedb
/usr/local/wormbase/databases -> shared/databases
/usr/local/wormbase/extlib -> shared/extlib
/usr/local/wormbase/services -> shared/services
/usr/local/wormbase/website/production -> shared/website/production
/usr/local/wormbase/website-shared-files -> shared/website-shared-files
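A sketch of recreating those symlinks on a fresh node. WB_ROOT stands in for /usr/local/wormbase so the layout can be rehearsed in a scratch directory first; on a real node, set WB_ROOT=/usr/local/wormbase, where the NFS share is already mounted (so the mkdir is a no-op).

```shell
# Use a scratch directory unless WB_ROOT is set (on a real node this
# would be /usr/local/wormbase with the NFS share mounted at shared/).
WB_ROOT=${WB_ROOT:-$(mktemp -d)}
mkdir -p "$WB_ROOT/shared/databases" "$WB_ROOT/shared/extlib" \
         "$WB_ROOT/shared/services"  "$WB_ROOT/shared/website-shared-files"

cd "$WB_ROOT"
for link in databases extlib services website-shared-files; do
    # -sfn replaces any stale symlink left by a previous deploy
    ln -sfn "shared/$link" "$link"
done
```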

Each webserver node hosts its own AceDB database and server, and its own MySQL databases.

Individual webserver nodes should be configured essentially as described in the Installing WormBase documentation, except that they do not require nginx.

HTTP server: PSGI/Plack + Starman

See Starman: the lightweight http server section in the Installing WormBase documentation.

Memcached/libmemcached

See above for details.

The Webapp

The web app and all Perl libraries will be installed automatically by the deploy_wormbase_webapp.sh script.

 /usr/local/wormbase/website/production -> WSXXXX-YYYY.MM.DD-X.XX-XXXX
 /usr/local/wormbase/website/WSXXX-YYYY.MM.DD-X.XX-XXXX

For details on installation of the web app itself, see the Install The Webapp section of the main Installing WormBase guide.


Launch services on back end machines

# memcached
$ sudo /etc/init.d/memcached restart

# starman
cd /usr/local/wormbase/website/production/bin
./starman-production.sh start


Hadoop Distributed File System (HDFS+Hoop)

We use the Hadoop Distributed File System to make it easier and faster to move files around.

Documentation on setting up Hadoop:

http://hadoop.apache.org/common/docs/current/single_node_setup.html

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

Install HDFS on each node.

http://mirror.olnevhost.net/pub/apache//hadoop/common/stable/hadoop-0.20.203.0rc1.tar.gz


Start the cluster:

$ bin/start-all.sh

Standalone operation

By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.

The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.

$ mkdir input  
$ cp conf/*.xml input 
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+' 
$ cat output/*
Download and unpack Hadoop under the shared services directory, and install a JRE:

cd /usr/local/wormbase/shared/services
curl -O http://mirror.olnevhost.net/pub/apache//hadoop/common/stable/hadoop-0.20.203.0rc1.tar.gz
tar xzf hadoop*
ln -s hadoop-0.20.203 hadoop

sudo apt-get install default-jre

Set up a single node:

addgroup hadoop
adduser --ingroup hadoop hduser


Configure the hduser:

sudo su hduser

Append the following to .bashrc

# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/wormbase/services/hadoop

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-6-sun

# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
    hadoop fs -cat $1 | lzop -dc | head -1000 | less
}

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin

Create an ssh key:

ssh-keygen -t rsa -P ""

Let the hadoop user connect to localhost:

cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
chmod 600 $HOME/.ssh/authorized_keys

Test

ssh localhost


Set up hadoop

Edit /usr/local/wormbase/shared/services/hadoop/conf/hadoop-env.sh and set:

JAVA_HOME=/usr/lib/jvm/default-java

Configure directory where hadoop stores its files:

conf/core-site.xml
<!-- In: conf/core-site.xml -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hdfs/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>

Create the temporary directory:

$ sudo mkdir -p /usr/local/hdfs/tmp
$ sudo chown hduser:hadoop /usr/local/hdfs/tmp

...and if you want to tighten up security, chmod from 755 to 750:

$ sudo chmod 750 /usr/local/hdfs/tmp

In conf/mapred-site.xml

<!-- In: conf/mapred-site.xml -->
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

In conf/hdfs-site.xml

<!-- In: conf/hdfs-site.xml -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>

Format the namenode

hduser@ubuntu:~$ /usr/local/wormbase/services/hadoop/bin/hadoop namenode -format

HBase

curl -O http://www.reverse.net/pub/apache//hbase/stable/hbase-0.90.3.tar.gz


CouchDB

We use the document store CouchDB to store and replicate pregenerated HTML and other content across web server nodes.

See the CouchDB manual for a good introduction.

This post on StackOverflow discusses serving HTML via couch, too.

Comparing CouchDB and MongoDB? See this SlideShare presentation and this post on the MongoDB site.

Installation

Install dependencies

sudo apt-get install erlang libicu-dev libmozjs-dev libcurl4-openssl-dev

Build:

cd src
curl -O http://www.apache.org/dyn/closer.cgi?path=/couchdb/1.1.0/apache-couchdb-1.1.0.tar.gz
tar xzf apache-*
cd apache-*
./configure
make && sudo make install

Create a couchdb user if one doesn't already exist

  adduser --system \
           --home /usr/local/var/lib/couchdb \
           --no-create-home \
           --shell /bin/bash \
           --group --gecos \
           "CouchDB Administrator" couchdb

Fix permissions

chown -R couchdb:couchdb /usr/local/etc/couchdb
chown -R couchdb:couchdb /usr/local/var/lib/couchdb
chown -R couchdb:couchdb /usr/local/var/log/couchdb
chown -R couchdb:couchdb /usr/local/var/run/couchdb
chmod 0770 /usr/local/etc/couchdb
chmod 0770 /usr/local/var/lib/couchdb
chmod 0770 /usr/local/var/log/couchdb
chmod 0770 /usr/local/var/run/couchdb

Set up CouchDB to start up automatically.

cp /usr/local/etc/init.d/couchdb /etc/init.d/couchdb
sudo update-rc.d couchdb defaults

Edit the defaults file (/usr/local/etc/couchdb/default.ini) to allow more open permissions:

bind_address = 0.0.0.0

Start it up

sudo /etc/init.d/couchdb start

Test it out

curl http://127.0.0.1:5984/

Get a list of databases

curl -X GET http://127.0.0.1:5984/_all_dbs

Create a database for each new release of WormBase

curl -X PUT http://127.0.0.1:5984/wsXXX    # We'll create databases for each WSXXX version

Deleting databases

curl -X DELETE http://127.0.0.1:5984/wsXXX

Create a document

  curl -X PUT http://127.0.0.1:5984/wsXXX/UUID

Get a document

curl -X GET http://127.0.0.1:5984/wsXXX/UUID

Create a document with an attachment

  curl -X PUT http://127.0.0.1:5984/wsXXX/UUID/attachment \
       -d @/usr/local/wormbase/databases/WS226/cache/gene/overview/WBGene00006763.html -H "Content-Type: text/html"

Get the document's attachment directly.

  curl -X GET http://127.0.0.1:5984/wsXXX/UUID/attachment

Securing the database

Add an admin user and password to:

/usr/local/etc/couchdb/local.ini

The password will be hashed automatically.

Populating the database

During the staging process, the precache_widgets.pl script creates then populates a CouchDB instance on the development server. See that script and its related modules for examples.

In the future, we might also want to explore bulk inserts and performance tuning.
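For reference, a bulk insert would go through CouchDB's _bulk_docs endpoint. The helper function and document IDs below are illustrative only, and the final curl line is left commented since it needs a running CouchDB:

```shell
# Assemble a two-document _bulk_docs payload (document bodies are
# placeholders; a real loader would inline the pregenerated HTML).
build_payload () {
    printf '{"docs":[{"_id":"%s","html":"..."},{"_id":"%s","html":"..."}]}' "$1" "$2"
}
payload=$(build_payload gene_overview_WBGene00006763 gene_overview_WBGene00000001)
echo "$payload"

# POST it to the release database (requires a running CouchDB):
# curl -X POST http://127.0.0.1:5984/wsxxx/_bulk_docs \
#      -H 'Content-Type: application/json' -d "$payload"
```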

Querying the database

Get some general information about a database

curl -X GET http://127.0.0.1:5984/wsxxx/

Replication

See the guide to replication in the CouchDB book.

Replication is handled by the production/steps/replicate_couchdb.pl script.
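The underlying API call is a one-shot POST to /_replicate. A sketch follows; the source hostname is a placeholder, and the curl line is commented since it needs a running CouchDB:

```shell
# Pull the wsxxx database from the development server into the local
# CouchDB, creating the target database if it doesn't exist yet.
payload='{"source":"http://dev.example.org:5984/wsxxx","target":"wsxxx","create_target":true}'
echo "$payload"

# curl -X POST http://127.0.0.1:5984/_replicate \
#      -H 'Content-Type: application/json' -d "$payload"
```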

Configuration

Edit /etc/couchdb/default.ini to the following.

The Futon Management interface

Couchapp

https://github.com/couchapp/couchapp


Measuring performance: http://till.klampaeckel.de/blog/archives/16-measuring-couchdb-performance.html

Data mining nodes

The data mining and BLAST/BLAT server replaces the old aceserver. Because it handles requests for the AQL and WB pages, it is configured exactly like the web cluster nodes, with the addition of BLAST, BLAT, and ePCR, plus the iptables directives shown above.

WormMart node

Social feature node

The WormBase Blog, the WormBase Wiki, and the Worm Community Forums all rely on third-party software. To make it easy to update this software, each of these components is maintained as a separate name-based virtual host running on the same server: wb-social.oicr.on.ca.

The WormBase Blog

The WormBase blog is a subdomain of wormbase.org: blog.wormbase.org. If it's moved, the DNS entry *must* be updated!

     Host/Port : wb-social.oicr.on.ca:80
     Alias: blog.wormbase.org
MySQL database : wormbase_wordpress_blog
 Document root : /usr/local/wormbase/website-blog/current
        Logs   : /usr/local/wormbase/logs/blog-access_log, /usr/local/wormbase/logs/blog-error_log

Blog files are stored in /usr/local/wormbase/website-blog/current:

 current -> wordpress-2.92

Add the following apache configuration to /usr/local/apache2/conf/extras/httpd-vhosts.conf

<VirtualHost *:80>
   ServerName blog.wormbase.org
   DocumentRoot /usr/local/wormbase/website-blog/current

    <Directory "/usr/local/wormbase/website-blog/current">
       DirectoryIndex index.php index.html
       AddType application/x-httpd-php .php
       Order Deny,Allow
       Allow from all
   </Directory>

   LogFormat "%h %l %u %t \"%r\" %s %b" common
   LogFormat "%h %l %u %t %{Referer}i \"%{User-Agent}i\" \"%r\" %s %b" combined_format
   LogFormat "withheld %l %u %t \"%r\" %s %b" anonymous

   ErrorLog     /usr/local/wormbase/logs/blog-error_log
   CustomLog    /usr/local/wormbase/logs/blog-access_log combined_format
</VirtualHost>
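The LogFormat directives above define how each access-log line is laid out. As a quick sanity check, here is a small Python sketch (ours, not part of the WormBase codebase) that parses lines written with the "common" format, %h %l %u %t "%r" %s %b:

```python
import re

# Matches lines produced by the "common" LogFormat:
#   %h %l %u %t "%r" %s %b
COMMON = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_common(line):
    """Return a dict of log fields, or None if the line does not match."""
    m = COMMON.match(line)
    return m.groupdict() if m else None

sample = '1.2.3.4 - - [10/Oct/2011:13:55:36 -0400] "GET /index.php HTTP/1.1" 200 2326'
print(parse_common(sample))
```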

NOTE: when upgrading, be sure to copy the wp-config.php file and entire wp-content/ directory.

The WormBase Wiki

The WormBase Wiki is a subdirectory of the primary WormBase domain. If it's moved, the proxy that sits in front of it must be updated!

     Host/Port : wb-social.oicr.on.ca:80
     Alias: wiki.wormbase.org
MySQL database : wormbase_wiki
 Document root : /usr/local/wormbase/website-wiki/current
        Logs   : /usr/local/wormbase/logs/wiki-access_log, /usr/local/wormbase/logs/wiki-error_log

Add the following apache configuration to /usr/local/apache2/conf/extras/httpd-vhosts.conf

<VirtualHost *:80>
   ServerName wiki.wormbase.org

   # Current is a symlink to the current installation.
   DocumentRoot /usr/local/wormbase/website-wiki/current

    <Directory "/usr/local/wormbase/website-wiki/current">
       DirectoryIndex index.php index.html
       AddType application/x-httpd-php .php
       Order Deny,Allow
       Allow from all
   </Directory>

   LogFormat "%h %l %u %t \"%r\" %s %b" common
   LogFormat "%h %l %u %t %{Referer}i \"%{User-Agent}i\" \"%r\" %s %b" combined_format
   LogFormat "withheld %l %u %t \"%r\" %s %b" anonymous

   ErrorLog     /usr/local/wormbase/logs/wiki-error_log
   CustomLog    /usr/local/wormbase/logs/wiki-access_log combined_format
</VirtualHost>

The Worm Community Forums

The Worm Community Forums run under the primary WormBase domain. If they're moved, the proxy that sits in front of them must be updated!

     Host/Port : wb-social.oicr.on.ca:80
     Alias: forums.wormbase.org
MySQL database : wormbaseforumssmf
 Document root : /usr/local/wormbase/website-forums
        Logs   : /usr/local/wormbase/logs/forums-access_log, /usr/local/wormbase/logs/forums-error_log

Add the following apache configuration to /usr/local/apache2/conf/extras/httpd-vhosts.conf

<VirtualHost *:80>
   ServerName forums.wormbase.org
   # Current is a symlink to the current version of SMF
   DocumentRoot /usr/local/wormbase/website-forums/current

    <Directory "/usr/local/wormbase/website-forums/current">
       DirectoryIndex index.php index.html
       AddType application/x-httpd-php .php
       Order Deny,Allow
       Allow from all
   </Directory>

   LogFormat "%h %l %u %t \"%r\" %s %b" common
   LogFormat "%h %l %u %t %{Referer}i \"%{User-Agent}i\" \"%r\" %s %b" combined_format
   LogFormat "withheld %l %u %t \"%r\" %s %b" anonymous

   ErrorLog     /usr/local/wormbase/logs/forums-error_log
   CustomLog    /usr/local/wormbase/logs/forums-access_log combined_format

</VirtualHost>

Add "Listen 8081" to the primary httpd.conf file.

Note: If the forum is moved, it is also necessary to update Settings.php and the paths to the Sources and Themes directories in the forum Administration Panel > Configuration > Server Settings.

NFS

We use NFS at WormBase to consolidate logs, simplify maintenance of temporary and static files, and provide access to common (file-system) databases.

NFS is served from wb-web1. NOTE: this may be an I/O performance bottleneck; we may need a dedicated machine.

 /usr/local/wormbase/shared

This directory will be mounted at the same path for each client.

Contents:

logs/                  // Server logs
databases/             // File-based databases for various searches
tmp/                   // Temporary images
static/                // Static files
website-shared-files/  // Images and other shared files

Note that logs and databases are symlinks in each production node:

 /usr/local/wormbase/databases -> shared/databases

Install NFS

On Debian, the NFS server needs:

sudo apt-get install nfs-kernel-server nfs-common portmap

Specify what to share, to whom, and with which privileges

Share mounts are configured through /etc/exports

# WormBase NFS for temporary and static content
/usr/local/wormbase/shared 206.108.125.177(rw,no_root_squash) 206.108.125.190(rw,no_root_squash) 206.108.125.168(rw,no_root_squash) 206.108.125.191(rw,no_root_squash)

After making changes to /etc/exports, load them by:

sudo exportfs -a

Lock down access to NFS from most hosts

We can use /etc/hosts.deny to quickly lock down NFS:

portmap:ALL
lockd:ALL
mountd:ALL
rquotad:ALL
statd:ALL

Allow access to select hosts

We can use /etc/hosts.allow to specifically allow access to NFS:

# WormBase NFS services 
portmap: ***.***.***.***
lockd: ***.***.***.***
rquotad: ***.***.***.***
mountd: ***.***.***.***
statd: ***.***.***.***

Configure NFS clients

Install NFS

On Debian, NFS clients require:

sudo apt-get install nfs-common portmap

Setting up NFS share to mount at boot

sudo emacs /etc/fstab

Add

wb-web1.oicr.on.ca:/usr/local/wormbase/shared /usr/local/wormbase/shared nfs rw,rsize=32768,wsize=32768,intr,noatime   0 0

Manually mount the NFS share

sudo mount ${NFS_SERVER}:/usr/local/wormbase/shared /usr/local/wormbase/shared   # the shared dir must already exist
or using the entry in fstab:
sudo mount /usr/local/wormbase/shared

Unmounting the NFS share

sudo umount /usr/local/wormbase/shared

Miscellaneous

Build the user preferences database

The website uses a MySQL backend to store user preferences, browsing history, and session data. This shouldn't ever need to be recreated (at least until we have a migration path in place from an old database to a new one!), but here's how to create it for reference. For now, this database is hosted on the same server providing the reverse proxy.

mysql -u root -p < /usr/local/wormbase/website/production/util/user_login.sql
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@localhost';
# All nodes currently use the same session database.
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-web1.oicr.on.ca';
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-web2.oicr.on.ca';
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-web3.oicr.on.ca';
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-web4.oicr.on.ca';
mysql -u root -p -e 'grant all privileges on wormbase_user.* to wb@wb-mining.oicr.on.ca';

Q: How/Where do I configure the location of the wormbase_user database in the application?

Logs

All relevant logs can be found at:

ls /usr/local/wormbase/logs
nginx-error.log    // The reverse proxy error log
nginx-access.log   // The reverse proxy access log
nginx-cache.log    // The reverse proxy cache log
catalyst_error.log // The catalyst error log

See Also

Monitoring

Logs

Google



EVERYTHING BELOW HERE IS DEPRECATED

Monitoring

See the monitoring services document. Note that Nagios requires Apache and FCGI.

Should I preserve the fastcgi,fcgi configuration just in case?

FastCGI, FCGI, Apache, and mod_perl

Originally, WormBase ran under Apache + mod_perl.

We also experimented with fcgi and fcgid + Apache.


Installing fastcgi

curl -O http://www.fastcgi.com/dist/mod_fastcgi-2.4.6.tar.gz
tar xzf mod_fastcgi*
cd mod_fastcgi*
cp Makefile.AP2 Makefile
make top_dir=/usr/local/apache2
sudo make top_dir=/usr/local/apache2 install

If you get an error on make saying it can't find special.mk (which is supposed to be distributed with httpd but isn't on CentOS and is not part of httpd-devel, either), try:

sudo apxs -n mod_fastcgi -i -a -c mod_fastcgi.c fcgi_buf.c fcgi_config.c fcgi_pm.c fcgi_protocol.c fcgi_util.c

Add an entry to httpd.conf like this:

 LoadModule fastcgi_module modules/mod_fastcgi.so

 # Note: if you use the apxs command above, it inserts an incorrect line into your httpd.conf file.
 # Edit it to read exactly as above.

Launch the fastcgi server

   # as a socket server in daemon mode
  /usr/local/wormbase/website/script/wormbase_fastcgi.pl \
       -l /tmp/wormbase.sock -n 5 -p /tmp/wormbase.pid -d

    # as a daemon bound to a specific port
    script/wormbase_fastcgi.pl -l :3001 -n 5 -p /tmp/wormbase.pid -d

Set up the fastcgi server to launch at boot

Symlink the webapp-fastcgi.init script to /etc/init.d

cd /etc/init.d
sudo ln -s /usr/local/wormbase/website/util/init/webapp-fastcgi.init wormbase-fastcgi

Set up symlinks in runlevels:

cd ../rc3.d
sudo ln -s ../init.d/wormbase-fastcgi S99wormbase-fastcgi
cd ../rc5.d
sudo ln -s ../init.d/wormbase-fastcgi S99wormbase-fastcgi

Add a cron job that keeps FCGI under control

The following cron job will kill off fcgi children that exceed the specified memory limit (in bytes).

sudo crontab -e
*/30 * * * * /usr/local/wormbase/website/util/crons/fastcgi-childreaper.pl \
                `cat /tmp/wormbase.pid` 104857600
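fastcgi-childreaper.pl itself is not reproduced here, but the idea — kill any child whose resident set exceeds the byte limit — can be sketched in Python, assuming a Linux /proc filesystem. The function names are ours.

```python
import os
import re
import signal

def parse_vmrss(status_text):
    """Extract VmRSS (in bytes) from the text of /proc/<pid>/status."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) * 1024 if m else 0

def reap_if_oversized(pids, limit=104857600):  # 100 MB, as in the crontab entry
    """Send SIGTERM to any process whose resident set exceeds `limit` bytes."""
    for pid in pids:
        try:
            with open("/proc/%d/status" % pid) as fh:
                rss = parse_vmrss(fh.read())
        except OSError:
            continue  # process already exited
        if rss > limit:
            os.kill(pid, signal.SIGTERM)
```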

mod_fcgid

mod_fcgid is an alternative to mod_fastcgi.

cd src/
wget http://www.carfab.com/apachesoftware/httpd/mod_fcgid/mod_fcgid-2.3.5.tar.gz
tar xzf mod_fcgid-2.3.5.tar.gz 
cd mod_fcgid-2.3.5   
APXS=/usr/local/apache2/bin/apxs ./configure.apxs  
make
sudo make install

Apache

Configure Apache to connect to the fastcgi server

Edit /usr/local/apache2/conf/extra/httpd-vhosts.conf

<VirtualHost *:8000>
     #    ServerName beta.wormbase.org                                                                                     
     ErrorLog /usr/local/wormbase/logs/wormbase2.error_log
     TransferLog /usr/local/wormbase/logs/wormbase2.access_log


     # 502 is a Bad Gateway error, and will occur if the backend server is down
     # This allows us to display a friendly static page that says "down for
     # maintenance"
     Alias /_errors /home/todd/projects/wormbase/website/trunk/root/error-pages
     ErrorDocument 502 /_errors/502.html

     # Map dynamic images to the file system 
     # static images are located at img
     Alias /images       /tmp/wormbase/images/
 
  #  <Directory /filesystem/path/to/MyApp/root/static>
  #      allow from all
  #  </Directory>
  #  <Location /myapp/static>
  #      SetHandler default-handler
  #  </Location>

     # Static content served directly by Apache
     DocumentRoot /usr/local/wormbase/website/root
     #     Alias /static /usr/local/wormbase/website-2.0/root



     # Approach 1: Running as a static server (Apache handles spawning of the webapp)       
     # <IfModule fastcgi_module>
     #    FastCgiServer /usr/local/wormbase/website-2.0/script/wormbase_fastcgi.pl -processes 3                      
     #    Alias / /usr/local/wormbase/website-2.0/script/wormbase_fastcgi.pl/
     # </IfModule>
                                   

     # Approach 2: External Process (via mod_fcgi ONLY)
     <IfModule mod_fastcgi.c>
         # This says to connect to the Catalyst fcgi server running on localhost, port 7777
         #  FastCgiExternalServer /tmp/myapp.fcgi -host localhost:7777
         # Or to use the socket      
         FastCgiExternalServer /tmp/wormbase.fcgi -socket /tmp/wormbase.sock

         # Place the app at root...
         Alias /    /tmp/wormbase.fcgi/
  
         # ...or somewhere else
         Alias /wormbase/ /tmp/wormbase.fcgi/
      </IfModule>

     # fcgid configuration
     #     <IfModule mod_fcgid>
     #         # This should point at your myapp/root
     #          DocumentRoot /usr/local/wormbase/beta.wormbase.org/root
     #         Alias /static /usr/local/wormbase/beta.wormbase.org/root/static
     #         <Location /static>
     #                   SetHandler default-handler
     #          </Location>
     #
     #         Alias / /usr/local/wormbase/beta.wormbase.org/script/wormbase_fastcgi.pl/
     #         AddType application/x-httpd-php .php
     #         <Location />
     #                   Options ExecCGI
     #                   Order allow,deny
     #                   Allow from all
     #                   AddHandler fcgid-script .pl
     #          </Location>
     #     </IfModule>

   </VirtualHost>

Edit /usr/local/apache2/conf/httpd.conf

Add the appropriate Listen PORT directive.