How to build a WormBase Virtual Machine

From WormBaseWiki

Revision as of 19:19, 3 October 2007

Overview

WormBase virtual machines are created for each release of the database. The process is almost entirely scripted: release VMs are built from base virtual machines that run continuously and are automatically kept up to date with the production nodes.

To simplify the download and update process, WormBase virtual machines are split into modules. The primary virtual machine contains all software and configuration running under CentOS 5 (for servers) or Ubuntu 6.06 (for desktops). Databases are maintained as virtual disks (VMDKs).

Creation of a new VM requires three steps:

1. Syncing the software to the staging rsync module hosted on the main WormBase development site.

2. Creation of VMDKs for available databases.

3. Tarring and gzipping.

The process is described in more detail below.

Base Virtual Machines

I maintain two base virtual machines.

wormbase-live-server : CentOS 5, configured in particular for server use

wormbase-live-desktop : Ubuntu 6.06, configured for desktop users

About the WormBase core virtual machines

Core virtual machines are essentially production nodes virtualized. This makes them very convenient for development, testing, and even stop-gap emergency server recovery.

The key difference between these core virtual machines and their production counterparts is the location of the database directories. On live nodes, the databases are maintained on the local file system. In the virtual machines, databases are maintained as separate virtual disks (VMDKs). This makes it possible to update the software and databases independently, a great advantage when it comes to maintaining mirror sites.


To fetch the IP address of a virtual machine, log on to the appropriate host, then run:

vmware-cmd <cfg> getguestinfo "ip"

Directory structure of the Virtual Machines

Since databases are maintained as virtual disks, the virtual machine needs to know where to find them in order to launch. For the core virtual machine, the directory structure looks like this:

During build:

wormbase-live-server/wormbase.vmx
WS180-databases/acedb.vmdk
               /elegans_gff.vmdk
               /briggsae_gff.vmdk
               /remanei_gff.vmdk
               /autocomplete_gff.vmdk
               /support.vmdk
current_databases -> WS180-databases

Thus, the virtual machine expects the databases to be located at:

 ../current_databases/acedb.vmdk
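
Because the virtual machine will not launch without its disks, it can help to check the layout before starting it. The following is a minimal sketch, not part of the WormBase scripts: the function name check_vm_layout is hypothetical, and it assumes the six VMDK names listed above and that the argument is the directory holding both wormbase-live-server/ and current_databases.

```shell
# Sketch: confirm that current_databases is a symlink and that each
# expected VMDK can be reached through it.
check_vm_layout() {
    base_dir=$1
    missing=0
    # current_databases must be a symlink to the release directory
    if [ ! -L "$base_dir/current_databases" ]; then
        echo "current_databases is not a symlink" >&2
        return 1
    fi
    for disk in acedb elegans_gff briggsae_gff remanei_gff autocomplete_gff support; do
        # -f follows the symlink, so this tests the real file
        if [ ! -f "$base_dir/current_databases/$disk.vmdk" ]; then
            echo "missing: current_databases/$disk.vmdk" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Called as check_vm_layout /usr/local/vmx, it returns non-zero and names each missing disk on standard error.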

Updating the software

Software on the base virtual machines is kept in sync by the shell script ~wormbase/bin/pull_software.sh. The script syncs with the production nodes and runs once a day, at 2:00 AM, from the non-privileged wormbase user's crontab:

0 2 * * * /home/wormbase/bin/pull_software.sh

Building virtual disks for a new release

Shut down the core virtual machine:

vmware-cmd <cfg> shutdown

Run the prepare_virtual_machine.sh script:

 prepare_virtual_machine.sh WSXXX

This will set up a directory structure like this and untar some empty VMDKs:

 wormbase-live-server/wormbase.vmx
 WSXXX-databases/
 current_databases -> WSXXX-databases
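
The contents of prepare_virtual_machine.sh are not shown here, so the following is only a sketch of the directory and symlink part of that step. The function name prepare_layout and its second argument are assumptions, and unpacking the empty VMDKs is left out.

```shell
# Sketch: create the release database directory and repoint
# current_databases at it.
prepare_layout() {
    release=$1      # release name, e.g. WS180
    base_dir=$2     # directory that contains wormbase-live-server/
    mkdir -p "$base_dir/$release-databases" || return 1
    # A relative target keeps the link valid if base_dir is moved.
    # -n replaces an existing link instead of descending into it.
    ln -sfn "$release-databases" "$base_dir/current_databases"
}
```

Running prepare_layout WS181 /usr/local/vmx a release later would create WS181-databases and switch the link, leaving the previous release directory in place.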

Log on to the core virtual machine.

Virtual Machine exceptions

To make creation and maintenance of virtual machines easier, I've changed some of the default settings in the core machines.

1. Acedb database location

WormBase proper:

/usr/local/acedb

WormBase core VMXs:

/mnt/acedb

This requires a corresponding modification in the xinetd configuration.

2. MySQL datadir

WormBase proper:

/usr/local/mysql/data

WormBase core VMXs:

/mnt/mysql

3. Support databases

WormBase proper:

/usr/local/wormbase/database

WormBase core VMXs:

/mnt/support_databases
/usr/local/wormbase/databases -> /mnt/support_databases
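
The three relocations above can be checked from inside a guest with a short script. This is a sketch under assumptions: the function name check_vm_paths and its optional root prefix are hypothetical, and it only tests that the paths exist, not that the virtual disks are actually mounted on them.

```shell
# Sketch: confirm the core-VM locations for acedb, MySQL and the
# support databases, plus the compatibility symlink.
check_vm_paths() {
    root=${1:-}     # optional prefix; empty means the live filesystem
    status=0
    for dir in /mnt/acedb /mnt/mysql /mnt/support_databases; do
        if [ ! -d "$root$dir" ]; then
            echo "missing directory: $dir" >&2
            status=1
        fi
    done
    # /usr/local/wormbase/databases should be a link, not a real directory
    if [ ! -L "$root/usr/local/wormbase/databases" ]; then
        echo "not a symlink: /usr/local/wormbase/databases" >&2
        status=1
    fi
    return "$status"
}
```

With no argument it inspects the running system; the prefix exists only so the check can be tried against a scratch directory.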

Updating the software after distribution

Once distributed, WormBase virtual machines can be kept up to date by a simple script. This has several advantages:

1. It keeps download sizes small

2. It keeps local configuration from being overwritten with every update.

3. It modularizes required databases so that users can choose what databases they wish to install.


Establishing the Virtual Machine

Build and install VMware Server (currently version 1.0.3):

cd ~/build
tar xzf ../src/vmware-server-1.0.3.tar.gz
cd vmware-server-distrib
sudo ./vmware-install.pl

For WormBase, I place the virtual machines in /usr/local/vmx.

Installing the OS

Fetch a suitable ISO. From the console interface, edit options for the CD-ROM. Attach the ISO and make sure the "Connect on Startup" option is checked.

Users and groups

WormBase virtual machines have a slightly different user and group arrangement than we have traditionally used.

The main user is WormBase User:

Login: wormbase
Pass: wormbase
Home: /home/wormbase

To keep things copacetic with WormBase proper, I've created a symlink: /usr/local/wormbase -> /home/wormbase





Preparing a VMX for release

1. Start the guest OS.

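2. In the guest, purge things like access logs, tarballs, etc.

3. Zero the empty space on the disk so that it can be shrunk later (step 12). dd stops with a "No space left on device" error once the disk is full; that is expected.

 sudo dd if=/dev/zero of=/empty_file
 sudo rm /empty_file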

4. Set the VMX to graphical runlevel 5 (/etc/inittab)

5. Shut down the guest.

6. Copy the current wormbase-live to a directory named by release:

cp -r wormbase-live wormbase-WSXXX.YYYY.MM.DD

7. In the console, set the version and release date

      WormBase (WSXXX; DD Feb YYYY)

8. Defragment the disk from the VMware console:

   Edit options > Hard Disk > Defrag the disk

9. In the VMware console, set networking to NAT (assuming desktop usage) and restart the guest.

10. Start the new VMX.

11. Reset the MAC address

12. Finish shrinking the disk using the vmware-toolbox:

 $ vmware-toolbox (select shrink)

13. When complete, shut down the VMX

14. Package

tar czf wormbase-WSXXX.YYYY.MM.DD.tgz wormbase-WSXXX.YYYY.MM.DD

15. Symlink to make it available via http

cd /usr/local/wormbase/html/vmx
ln -s /usr/local/vmx/wormbase-WSXXX.YYYY.MM.DD.tgz wormbase-WSXXX.YYYY.MM.DD.tgz

16. Upload the new VM to BitTorrent

17. Update the Virtual Machines page on the wiki.
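
Steps 14 and 15 lend themselves to scripting once the manual console steps are done. The following is a minimal sketch rather than the script actually used for WormBase releases: the function name publish_release is hypothetical, and the two directory arguments stand for /usr/local/vmx and /usr/local/wormbase/html/vmx as used above.

```shell
# Sketch: package a prepared VM directory and expose the tarball
# through the web-visible directory.
publish_release() {
    name=$1      # e.g. wormbase-WSXXX.YYYY.MM.DD
    vmx_dir=$2   # directory holding the prepared VM directory
    web_dir=$3   # directory served over http
    if [ ! -d "$vmx_dir/$name" ]; then
        echo "no such VM directory: $vmx_dir/$name" >&2
        return 1
    fi
    # -C makes the paths inside the archive relative to vmx_dir
    tar czf "$vmx_dir/$name.tgz" -C "$vmx_dir" "$name" || return 1
    # -f replaces a link left over from an earlier attempt
    ln -sf "$vmx_dir/$name.tgz" "$web_dir/$name.tgz"
}
```

Taking the release name as a single argument means the tarball and the symlink cannot end up with different names.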

Configuring VMXs as hosted frozen releases

To use a Virtual Machine as a server, a few small modifications need to be made.

1. From the VMware Server console, launch the virtual machine.

2. Set a static IP (must be assigned!)

In this example, the guest OS IP is 143.48.220.208. This should be changed to whatever your assigned IP address is.

ifconfig eth0:0 143.48.220.208 netmask 255.255.255.0 broadcast 143.48.220.255
route add -host 143.48.220.208 dev eth0

You can also do this from the GUI if you prefer, under System Settings -> Network. Double click on the network adaptor.

Address:  Your assigned IP address
Subnet mask: 255.255.255.0
Default gateway: 143.48.220.254
Broadcast host:  143.48.220.255 (not explicitly set in the GUI)

3. Reset the MAC ID of the guest

System Tools > Network

Double click on the network adaptor and select the "Hardware" tab. Click on "Probe", then "OK"

4. Add the following lines to /etc/resolv.conf for DNS

search cshl.edu
nameserver 143.48.1.1
nameserver 143.48.1.20

5. Set the hostname

This can be done either in the GUI under the Network panel, or from the command line as follows.

If you have a static IP address, then /etc/hosts is configured as follows:

127.0.0.1       localhost.localdomain   localhost
143.48.220.44   mybox.mydomain.com      mybox

After updating the /etc/hosts file correctly, the "hostname" command should be run as follows to set your hostname:

hostname mybox.mydomain.com

6. Edit /usr/local/wormbase/conf/localdefs.pm and httpd.conf with the appropriate hostname


8. Shut down the virtual machine and copy it as a backup.

I append "server" to the name to indicate that it is configured as a server:

  tar czf wormbase-WS100.2003.05.13-server.tgz wormbase-WS100.2003.05.13
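
When substituting your own address in step 2, the broadcast address follows from the IP address and netmask: each octet is the address octet ORed with the inverted mask octet. The helper below is an illustrative sketch (the name broadcast_addr is hypothetical); it assumes well-formed dotted-quad input and does no validation.

```shell
# Sketch: derive the IPv4 broadcast address from an address and netmask.
broadcast_addr() {
    ip=$1
    mask=$2
    # Split each dotted quad into its four octets
    IFS=. read -r i1 i2 i3 i4 <<EOF
$ip
EOF
    IFS=. read -r m1 m2 m3 m4 <<EOF
$mask
EOF
    # 255 - m is the inverted mask octet; OR it into the address octet
    echo "$(( i1 | (255 - m1) )).$(( i2 | (255 - m2) )).$(( i3 | (255 - m3) )).$(( i4 | (255 - m4) ))"
}
```

For the example address, broadcast_addr 143.48.220.208 255.255.255.0 prints 143.48.220.255, matching the value passed to ifconfig in step 2.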