How to build a WormBase Virtual Machine

Latest revision as of 23:33, 13 August 2010

Overview

WormBase Virtual Machines are created for each release of the database. The process is almost entirely scripted: each release is built from base virtual machines that run continuously and are automatically kept in sync with the production nodes.

To simplify the download and update process, WormBase virtual machines are split into modules. The primary virtual machine contains all software and configuration running under CentOS 5 (for servers) or Ubuntu 6.06 (for desktops). Databases are maintained as virtual disks (VMDKs).

Creation of a new VM requires three steps:

1. Syncing the software to the staging rsync module hosted on the main WormBase development site.

2. Creation of VMDKs for available databases.

3. Tarring and gzipping.
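Under those assumptions, the three steps map onto the helper scripts used in the Quick Start Guide. A dry-run sketch (the run wrapper only prints each command; WS180 and the date stamp are placeholders):

```shell
#!/bin/sh
# Dry-run sketch of the three release steps. Script locations are the ones
# from the Quick Start Guide; the release name and date are placeholders.
RELEASE=WS180
STAMP=2010.08.13
run() { echo "+ $*"; }   # swap the echo for "$@" to actually execute

run ~/wormbase/bin/pull_software.sh            # 1. sync software from staging
run ~/wormbase/bin/build_vmdks.sh "$RELEASE"   # 2. build database VMDKs
run ./package_vmx.sh "$RELEASE" "$STAMP"       # 3. tar and gzip the VMX
```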

The process is described in more detail below.

Quick Start Guide

1. Log on to the host machine

2. Shutdown the appropriate guest.

vmware-cmd <cfg> shutdown

Tip: The vmware-cmd command has a bunch of options for interacting with running VMXs. Try vmware-cmd --help for information
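For reference, a few invocations that come up often in this workflow. This is a sketch of VMware Server 1.x usage, not an exhaustive list, and the example .vmx path is an assumption; check vmware-cmd --help on your own host:

```shell
# Example .vmx path - adjust to wherever your guests live (assumption):
CFG=/usr/local/vmx/wormbase-live-server/wormbase.vmx

# vmware-cmd -l                  # list all registered VMXs on this host
# vmware-cmd "$CFG" getstate     # is the guest on or off?
# vmware-cmd "$CFG" start        # power the guest on
# vmware-cmd "$CFG" stop         # shut the guest down
```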

3. Attach some virtual disks

./prepare_virtual_machine.sh WSXXX

4. Reboot the guest

vmware-cmd <cfg> start

5. Log on to the guest

vmware-cmd <cfg> getguestinfo "ip"
ssh wormbase@[ip] ; pass = wormbase
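Steps 4 and 5 can be rolled into one small helper. The parsing below assumes vmware-cmd prints the value as the last whitespace-separated field (e.g. "getguestinfo(ip) = 143.48.220.208"); treat both that output format and the .vmx path as assumptions:

```shell
# Hypothetical helper: fetch the guest's IP from the host, then ssh in.
guest_ip() { vmware-cmd "$1" getguestinfo "ip" | awk '{print $NF}'; }

CFG=/usr/local/vmx/wormbase-live-server/wormbase.vmx   # assumed path
# IP=$(guest_ip "$CFG") && ssh "wormbase@$IP"          # password: wormbase
```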

6. Update the software (optional: should already be up-to-date as this runs under cron)

~/wormbase/bin/pull_software.sh

7. Build the VMDKs

~/wormbase/bin/build_vmdks.sh [VERSION]

Note: you will need root privs on the VM to mount/unmount VMDKs and ssh access to transfer DBs from brie3!

8. Shutdown the guest and package the new VMX from the host

./package_vmx.sh WS180 YYYY.MM.DD

Core Virtual Machines

Currently, I maintain a single core virtual machine, running CentOS 5 and configured primarily for use as a server. It also contains a number of extras that make it useful as a desktop.

This core virtual machine is essentially a production node virtualized. This makes it very convenient for development, testing, and even stop-gap emergency server recovery.

The key difference between the core machine and a production server is the location of the database directories. On live nodes, databases are maintained on the local file system. In the virtual machines, databases are maintained as separate virtual disks (VMDKs). This makes it possible to update the software and databases independently, a great advantage when it comes to maintaining mirror sites.

Since databases are maintained as virtual disks, the virtual machine needs to know where to find them in order to launch. For the core virtual machine, the directory structure looks like this:

During build:

WSXXX/
    |
    --wormbase-live-server/wormbase.vmx
    --databases/
      |
       --acedb/
       --autocomplete/
       --c_elegans/
       --other_species/
       --support/
    --current_databases -> databases

Thus, the virtual machine expects the databases to be located at:

 ../current_databases/acedb/20GB.vmdk


Users and groups

The core virtual machine has the following users and groups:

The main user is WormBase User:

Login: wormbase
pass: wormbase
home: /home/wormbase

root is wermbace. Don't tell anyone.

Updating the software

Software on the base virtual machine is kept in sync by the shell script ~wormbase/bin/pull_software.sh. The script syncs against the production nodes and runs daily under the non-privileged wormbase cron:

0 2 * * * /home/wormbase/bin/pull_software.sh
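One way to install that entry without hand-editing the crontab (a sketch; the exact 02:00 schedule is an assumption):

```shell
# Append the daily sync job to the current user's crontab (run as wormbase).
CRON_LINE='0 2 * * * /home/wormbase/bin/pull_software.sh'
# ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -   # uncomment to install
echo "$CRON_LINE"
```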

Updating the databases

Updating the databases for the core virtual machine is a bit of a misnomer. What we will really do is populate new empty virtual disks with the current databases. Here's how.

Shutdown the core virtual machine:

vmware-cmd <cfg> shutdown

Run the prepare_virtual_machine.sh script:

 prepare_virtual_machine.sh WSXXX

This will set up a directory structure like this and untar some empty VMDKs:

 wormbase-live-server/wormbase.vmx
 databases/
 current_databases -> databases

It's important that the databases maintain this relative structure or they will not be available to the VMX.
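A quick way to catch a broken layout before powering the guest on is a small check script. This is a hypothetical helper, not one of the scripts shipped with the VM:

```shell
# Verify that a release directory has the layout the VMX expects.
check_layout() {
    root=$1
    [ -e "$root/wormbase-live-server/wormbase.vmx" ] || { echo "missing wormbase.vmx"; return 1; }
    [ -d "$root/databases" ]         || { echo "missing databases/"; return 1; }
    [ -L "$root/current_databases" ] || { echo "missing current_databases symlink"; return 1; }
    echo "layout OK"
}
# Usage: check_layout /usr/local/vmx/WSXXX
```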

Reboot and log on to the core virtual machine.

Run the database script.

~wormbase/bin/build_vmdks.sh WSXXX

You will need to be me. Sorry, I haven't fixed this yet.

House-cleaning of the core virtual machine

It's good to periodically clean the guest OS. This includes defragging and purging temporary files to keep the size of the virtual machine in check. Here's a general outline.

1. Start the guest OS.

2. In the guest, purge things like access logs, tarballs, etc.

3. Shrink the disk in the guest by first zeroing empty space

 todd> sudo dd if=/dev/zero of=/empty_file
 todd> rm /empty_file
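The same zero-fill idea, sketched with a couple of refinements (general dd conventions, not the official procedure): a block size speeds up the fill, and a sync flushes the zeroed blocks to disk before the file is deleted. The count= keeps this example safe to run; drop it to fill all free space for real.

```shell
target=/tmp/empty_file            # use /empty_file (as root) for a real shrink
dd if=/dev/zero of="$target" bs=1M count=8 2>/dev/null   # drop count= to fill the disk
sync                              # flush the zeroed blocks to disk
rm "$target"
```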

4. Shutdown the guest

5. Defragment the disk from the VMWare console:

   Edit options > Hard Disk > Defrag the disk

6. Restart the guest

7. Finish shrinking the disk using the vmware-toolbox:

 todd> vmware-toolbox (select shrink)