How to build a WormBase Virtual Machine
- 1 Overview
- 2 Quick Start Guide
- 3 Base Virtual Machines
- 4 About the WormBase core virtual machines
- 5 Directory structure of the Virtual Machines
- 6 Updating the software
- 7 Building virtual disks for a new release
- 8 Updating the software after distribution
WormBase virtual machines are created for each release of the database. The process is almost entirely scripted: each release is built from base virtual machines that run continuously and are automatically kept up to date with the production nodes.
To simplify the download and update process, WormBase virtual machines are split into modules. The primary virtual machine contains all software and configuration running under CentOS 5 (for servers) or Ubuntu 6.06 (for desktops). Databases are maintained as virtual disks (VMDKs).
Creation of a new VM requires three steps:
1. Syncing the software to the staging rsync module hosted on the main WormBase development site.
2. Creation of VMDKs for available databases.
3. Tarring and gzipping.
The process is described in more detail below.
Quick Start Guide
1. Log on to the host machine
2. Shut down the appropriate guest.
vmware-cmd <cfg> shutdown
3. Attach some virtual disks
4. Reboot the guest
vmware-cmd <cfg> start
5. Log on to the guest
vmware-cmd <cfg> getguestinfo "ip"
ssh wormbase@[ip] ; pass = wormbase
6. Update the software (optional: should already be up-to-date as this runs under cron)
7. Build the VMDKs
8. Shut down the guest and package the new VMX from the host
./package_vmx.sh WS180 YYYY.MM.DD
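The contents of package_vmx.sh are not shown on this page. As a rough sketch of what the tar-and-gzip step could look like, assuming the archive is named after the release and date given on the command line: the function name, argument order, and archive naming below are illustrative assumptions, not the actual script.

```shell
#!/bin/sh
# Hypothetical sketch; this is NOT the real package_vmx.sh.
# Usage: package_vmx <vmx_dir> <release> <date>

package_vmx() {
    vmx_dir="$1"     # directory holding wormbase.vmx, e.g. wormbase-live-server
    release="$2"     # e.g. WS180
    date_tag="$3"    # e.g. YYYY.MM.DD
    # Assumed naming scheme: <vm name>-<release>-<date>.tgz
    archive="$(basename "$vmx_dir")-${release}-${date_tag}.tgz"

    # Archive relative to the parent directory so the tarball unpacks
    # into a single top-level directory. The archive is written to the
    # current working directory.
    tar -czf "$archive" -C "$(dirname "$vmx_dir")" "$(basename "$vmx_dir")" || return 1
    echo "$archive"
}
```

Called as `package_vmx wormbase-live-server WS180 YYYY.MM.DD`, this prints the name of the tarball it wrote. The guest should be shut down first so the disk files are not changing while they are archived.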
Base Virtual Machines
I maintain two base virtual machines.
wormbase-live-server : CentOS 5, configured in particular for server use
wormbase-live-desktop : Ubuntu 6.06, configured for desktop users
About the WormBase core virtual machines
Core virtual machines are essentially production nodes virtualized. This makes them very convenient for development, testing, and even stop-gap emergency server recovery.
The key difference between these core virtual machines and their production counterparts is the location of the database directories. On live nodes, the databases are kept on the local file system. In the virtual machines, databases are maintained as separate virtual disks (VMDKs). This makes it possible to update the software and databases independently, a great advantage when it comes to maintaining mirror sites.
To fetch the IP address for a virtual machine, log on to the appropriate host, then:
vmware-cmd <cfg> getguestinfo "ip"
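For scripting, it can help to reduce that output to the bare address. The sketch below assumes vmware-cmd prints a line of the form `getguestinfo(ip) = 10.0.0.5`; check the output on your host before relying on it. The helper names `parse_guest_ip` and `guest_ip` are made up for this example.

```shell
#!/bin/sh
# Sketch: extract the bare IP address from vmware-cmd output.
# Assumption: the output looks like "getguestinfo(ip) = 10.0.0.5".

parse_guest_ip() {
    # Drop everything up to and including the last "= ";
    # a line that is already a bare address passes through unchanged.
    sed -e 's/^.*= *//'
}

guest_ip() {
    cfg="$1"    # path to the guest's .vmx configuration file
    vmware-cmd "$cfg" getguestinfo "ip" | parse_guest_ip
}
```

With these defined, logging on to the guest becomes `ssh wormbase@"$(guest_ip <cfg>)"`, where `<cfg>` is the same configuration file path used above.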
Directory structure of the Virtual Machines
Since databases are maintained as virtual disks, the virtual machine needs to know where to find them in order to launch. For the core virtual machine, the directory structure looks like this:
wormbase-live-server/wormbase.vmx
WS180-databases/acedb.vmdk
               /elegans_gff.vmdk
               /briggsae_gff.vmdk
               /remanei_gff.vmdk
               /autocomplete_gff.vmdk
               /support.vmdk
current_databases -> WS180-databases
Thus, the virtual machine expects the databases to be located at :
Updating the software
Software on the base virtual machines is kept in sync by the shell script ~wormbase/bin/pull_software.sh. The script syncs with the production nodes and runs once a day (at 02:00) from a non-privileged crontab:
0 2 * * * /home/wormbase/bin/pull_software.sh
Building virtual disks for a new release
Shut down the core virtual machine:
vmware-cmd <cfg> shutdown
Run the prepare_virtual_machine.sh script:
This will set up a directory structure like this and untar some empty VMDKs:
wormbase-live-server/wormbase.vmx
WSXXX-databases/
current_databases -> WSXXX-databases
It's important that the databases maintain this relative structure or they will not be available to the VMX.
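Before rebooting the guest, the layout can be checked from the host. The following is a minimal sketch, not part of the WormBase scripts: the function name `check_vmx_layout` is hypothetical, and the list of VMDK names is taken from the WS180 example on this page, so adjust it to the databases in your release.

```shell
#!/bin/sh
# Sketch: verify that the VMX and its database disks sit in the
# expected relative layout. VMDK names follow the WS180 example.

check_vmx_layout() {
    base_dir="$1"   # directory containing wormbase-live-server/ and current_databases
    missing=0

    [ -f "$base_dir/wormbase-live-server/wormbase.vmx" ] || {
        echo "missing: wormbase-live-server/wormbase.vmx"; missing=1; }

    # current_databases must be a symlink to the release directory.
    [ -L "$base_dir/current_databases" ] || {
        echo "missing symlink: current_databases"; missing=1; }

    for db in acedb elegans_gff briggsae_gff remanei_gff autocomplete_gff support; do
        [ -f "$base_dir/current_databases/$db.vmdk" ] || {
            echo "missing: current_databases/$db.vmdk"; missing=1; }
    done

    # Return 0 when everything was found, 1 otherwise.
    return "$missing"
}
```

Running `check_vmx_layout /usr/local/vmx` (or wherever the virtual machines live on your host) prints one line per missing item and returns non-zero if anything is out of place.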
Reboot and log on to the core virtual machine.
Run the database script.
You will need to be me. Sorry, I haven't fixed this yet.
Updating the software after distribution
Once distributed, WormBase virtual machines can be kept up to date with a simple script. This has several advantages.
1. It keeps download sizes small
2. It keeps local configuration from being overwritten with every update.
3. It modularizes required databases so that users can choose what databases they wish to install.
Establishing the Virtual Machine
Build and install VMware Server (currently vers 1.0.3)
cd ~/build
tar xzf ../src/vmware-server-1.0.3.tar.gz
sudo ./vmware-install.pl
For WormBase, I place the virtual machines in /usr/local/vmx.
Installing the OS
Fetch a suitable ISO. From the console interface, edit options for the CD-ROM. Attach the ISO and make sure the "Connect on Startup" option is checked.
Users and groups
WormBase virtual machines have a slightly different user and group arrangement than we have traditionally used.
The main user is WormBase User:
Login: wormbase
pass: wormbase
home: /home/wormbase
To keep things copacetic with WormBase proper, I've created a symlink: /usr/local/wormbase -> /home/wormbase
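One way to create that symlink so the command can be re-run safely is sketched below. The function name `link_wormbase_home` is made up for this example, and the defaults simply mirror the paths given above.

```shell
#!/bin/sh
# Sketch: create (or refresh) the compatibility symlink.

link_wormbase_home() {
    target="${1:-/home/wormbase}"       # real home directory
    link="${2:-/usr/local/wormbase}"    # traditional WormBase location
    # -s: symbolic link; -f: replace an existing link;
    # -n: treat an existing link to a directory as a file rather than
    #     creating a new link inside it.
    ln -sfn "$target" "$link"
}
```

Writing under /usr/local normally needs root privileges, so in practice the function body would be run with sudo.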
Preparing a VMX for release
Periodically, it's good to shrink the size of the core WormBase virtual machines.
1. Start the guest OS.
2. In the guest, purge things like access logs, tarballs, etc
3. Shrink the disk in the guest by first zeroing empty space
todd> sudo dd if=/dev/zero of=/empty_file
todd> sudo rm /empty_file
4. Shut down the guest
5. Defragment the disk from the VMware console:
Edit options > Hard Disk > Defrag the disk
6. Restart the guest
7. Finish shrinking the disk using the vmware-toolbox:
todd> vmware-toolbox (select shrink)
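The zero-fill in step 3 can be wrapped in a small function. This is a sketch under a few assumptions: the function name `zero_free_space` and the optional size cap are additions for illustration (the cap makes it possible to try the function without filling the disk), and it must run as root to write to /.

```shell
#!/bin/sh
# Sketch: overwrite free space with zeros so the later shrink step
# can reclaim it, then remove the scratch file.

zero_free_space() {
    fill_file="${1:-/empty_file}"   # scratch file on the filesystem to be shrunk
    max_mb="$2"                     # optional cap in MB (illustrative addition)

    if [ -n "$max_mb" ]; then
        dd if=/dev/zero of="$fill_file" bs=1048576 count="$max_mb" 2>/dev/null
    else
        # Without a cap, dd stops with an error once the disk is full;
        # that is the expected outcome here.
        dd if=/dev/zero of="$fill_file" bs=1048576 2>/dev/null || true
    fi

    sync                  # flush the zeros to disk before deleting the file
    rm -f "$fill_file"
}
```

For a trial run, `zero_free_space /tmp/empty_file 10` writes and removes a 10 MB file. Calling it with no arguments as root reproduces step 3 above.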