How to build a WormBase Virtual Machine

Overview

WormBase Virtual Machines are created for each release of the database. The process is almost entirely scripted: each release is built from a base virtual machine that runs all the time and is automatically kept up-to-date with the production nodes.

To simplify the download and update process, WormBase virtual machines are split into modules. The primary virtual machine contains all software and configuration running under CentOS 5 (for servers) or Ubuntu 6.06 (for desktops). Databases are maintained as virtual disks (VMDKs).

Creation of a new VM requires three steps:

1. Syncing the software to the staging rsync module hosted on the main WormBase development site.

2. Creation of VMDKs for available databases.

3. Tarring and gzipping.

The process is described in more detail below.

Quick Start Guide

1. Log on to the host machine

2. Shutdown the appropriate guest.

vmware-cmd <cfg> shutdown

Tip: The vmware-cmd command has a bunch of options for interacting with running VMXs. Try vmware-cmd --help for information
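
For example, here are a few of the subcommands used throughout this page (the VMX path shown is illustrative; substitute the config file of the guest you are working with):

 # list the config files of all registered virtual machines
 vmware-cmd -l

 # check whether a guest is running, then fetch its IP once VMware Tools is up
 vmware-cmd /usr/local/vmx/WSXXX/wormbase-live-server/wormbase.vmx getstate
 vmware-cmd /usr/local/vmx/WSXXX/wormbase-live-server/wormbase.vmx getguestinfo "ip"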

3. Attach some virtual disks

./prepare_virtual_machine.sh WSXXX

4. Reboot the guest

vmware-cmd <cfg> start

5. Log on to the guest

vmware-cmd <cfg> getguestinfo "ip"
ssh wormbase@[ip] ; pass = wormbase

6. Update the software (optional: should already be up-to-date as this runs under cron)

~/wormbase/bin/pull_software.sh

7. Build the VMDKs

~/wormbase/bin/build_vmdks.sh [VERSION]

Note: you will need root privs on the VM to mount/unmount VMDKs and ssh access to transfer DBs from brie3!

8. Shutdown the guest and package the new VMX from the host

./package_vmx.sh WS180 YYYY.MM.DD
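
A minimal sketch of what the packaging step amounts to (the output file name and the /usr/local/vmx location are assumptions; the real package_vmx.sh on the host is authoritative):

 #!/bin/sh
 # package_vmx.sh (sketch): tar and gzip a release directory for distribution
 # usage: ./package_vmx.sh WS180 YYYY.MM.DD
 RELEASE=$1
 DATE=$2
 cd /usr/local/vmx || exit 1
 tar czf wormbase-${RELEASE}-${DATE}.tgz ${RELEASE}/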

Core Virtual Machines

Currently, I maintain one core virtual machine, running CentOS 5 and configured primarily for use as a server. It also contains a number of extras that make it useful as a desktop.

This core virtual machine is essentially a production node virtualized. This makes it very convenient for development, testing, and even stop-gap emergency server recovery.

The key difference between the core machine and a production server is the location of the database directories. On live nodes, the databases are maintained on the local file system. In the virtual machines, databases are maintained as separate virtual disks (VMDKs). This makes it possible to update the software and databases independently, a great advantage when it comes to maintaining mirror sites.

Since databases are maintained as virtual disks, the virtual machine needs to know where to find them in order to launch. For the core virtual machine, the directory structure looks like this:

During build:

WSXXX/
    |
    --wormbase-live-server/wormbase.vmx
    --databases/
      |
      --acedb/
      --autocomplete/
      --c_elegans/
      --other_species/
      --support/
    --current_databases -> databases

Thus, the virtual machine expects the databases to be located at:

 ../current_databases/acedb/20GB.vmdk
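
Inside the .vmx file this shows up as a virtual disk entry pointing at that relative path. A hypothetical entry might look like the following (the SCSI slot number is an assumption and will differ per disk):

 scsi0:1.present = "TRUE"
 scsi0:1.fileName = "../current_databases/acedb/20GB.vmdk"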


Users and groups

The core virtual machines have the following users and groups:

The main user is WormBase User:

Login: wormbase
pass: wormbase
home: /home/wormbase

The root password is wermbace. Don't tell anyone.
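
If the user ever needs to be recreated on a fresh guest, the standard incantation is roughly the following (group name and shell are assumptions):

 sudo groupadd wormbase
 sudo useradd -m -d /home/wormbase -g wormbase -s /bin/bash wormbase
 sudo passwd wormbase     # set to the password listed above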

Updating the software

Software on the core virtual machine is kept in sync with the shell script ~wormbase/bin/pull_software.sh. This script syncs the software from the production nodes and runs daily under the wormbase user's non-privileged cron:

 0 2 * * * /home/wormbase/bin/pull_software.sh
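
A rough sketch of what such a sync script can look like (the rsync host, module name, and target directory are all assumptions; the real pull_software.sh shipped in the VM is authoritative):

 #!/bin/sh
 # pull_software.sh (sketch): pull the WormBase software tree from the
 # staging rsync module on the development site (host/module are assumptions)
 rsync -av --delete rsync://dev.wormbase.org/wormbase-live/ /usr/local/wormbase/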

Updating the databases

Updating the databases for the core virtual machine is a bit of a misnomer. What we will really do is populate new empty virtual disks with the current databases. Here's how.

Shutdown the core virtual machine:

vmware-cmd <cfg> shutdown

Run the prepare_virtual_machine.sh script:

 prepare_virtual_machine.sh WSXXX

This will set up a directory structure like this and untar some empty VMDKs:

 wormbase-live-server/wormbase.vmx
 databases/
 current_databases -> databases

It's important that the databases maintain this relative structure or they will not be available to the VMX.
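
For orientation, the script's job on the host amounts to roughly the following (the /usr/local/vmx location and the name of the empty-VMDK tarball are assumptions; the real prepare_virtual_machine.sh is authoritative):

 #!/bin/sh
 # prepare_virtual_machine.sh (sketch): lay out the release directory,
 # unpack empty database VMDKs next to the VMX, and point the symlink at them
 WSXXX=$1                                   # e.g. WS180
 cd /usr/local/vmx/${WSXXX} || exit 1
 mkdir -p databases
 tar xzf /usr/local/vmx/empty_vmdks.tgz -C databases    # tarball name/location is an assumption
 ln -sfn databases current_databases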

Reboot and log on to the core virtual machine.

Run the database script.

~wormbase/bin/build_vmdks.sh WSXXX

You will need to be me. Sorry, I haven't fixed this yet.
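
Roughly speaking, the script does the following from inside the guest for each attached database disk (the device name, mount point, and source path on brie3 are assumptions; the real build_vmdks.sh handles the per-disk mapping):

 #!/bin/sh
 # build_vmdks.sh (sketch): mount one freshly attached (empty) database disk
 # and populate it from brie3; the real script loops over all of the disks
 WSXXX=$1
 DEV=/dev/sdb                       # device the empty VMDK appears as in the guest (assumption)
 MNT=/mnt/acedb
 sudo mkdir -p ${MNT}
 sudo mount ${DEV} ${MNT}
 sudo chown wormbase ${MNT}         # so the transfer can run as the wormbase user
 rsync -av brie3:/usr/local/acedb/elegans_${WSXXX}/ ${MNT}/    # source path is an assumption
 sudo umount ${MNT}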

House-cleaning of the core virtual machine

It's good to periodically clean the guest OS. This includes defragging and purging temporary files to keep the size of the virtual machine in check. Here's a general outline.

1. Start the guest OS.

2. In the guest, purge things like access logs, tarballs, etc
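
For example (paths are illustrative; adjust to the guest's actual layout):

 # examples only; the exact paths to purge depend on the guest's layout
 sudo rm -f /usr/local/wormbase/logs/*access_log*
 sudo rm -f /tmp/*.tar.gz /tmp/*.tgz
 sudo yum clean all                 # clear the CentOS package cache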

3. Shrink the disk in the guest by first zeroing empty space

 todd> sudo dd if=/dev/zero of=/empty_file
 todd> sudo rm /empty_file

4. Shutdown the guest

5. Defragment the disk from the VMWare console:

   Edit options > Hard Disk > Defrag the disk

6. Restart the guest

7. Finish shrinking the disk using the vmware-toolbox:

 todd> vmware-toolbox (select shrink)