Virtual Machines

From WormBaseWiki
Revision as of 02:19, 24 September 2007 by Tharris

Overview

"Virtual Machines" (or VMX) are self-contained packages of everything you need to use WormBase locally. To keep downloads small, we've split virtual machines into two components. First, you will need the primary virtual machine, which contains the WormBase software, within a Linux operating system.

Once you have the software, you can download whichever datasets you are interested in. We supply these as "virtual disks" -- think of them as removable disks, exactly like the ones you might have sitting next to your laptop. Download only what you need, such as C. elegans predictions, WormMart, etc.

Virtual machines can be "played" on Mac, Windows, and Linux platforms using the free VMPlayer (Linux, Windows) or the non-free VMWare Fusion Player (Mac), both available from VMWare.

Running a virtual machine of WormBase is incredibly easy.

  • Download the VMPlayer from VMWare
  • Download a WormBase virtual machine
  • Select the datasets you'd like to use and download them
  • Open the virtual machine using VMPlayer or VMFusion
  • Log in
  • A browser launches automatically and takes you to your copy of WormBase!

System Requirements

Available Virtual Machines

WormBase virtual machines are preconfigured for use on the desktop. This means that they are intended to be run and used on the same local machine. If you would like to run the virtual machine as a server you will need to make some small changes to the configuration. See the section "Running a WormBase Virtual Machine as a server" for additional details.

NOTES:

1. Due to their size and limited distribution, torrents of older releases may not be available. If you would like an older release, please send a request to Todd Harris (harris@cshl.edu).

2. We have been having trouble with BitTorrent seeds timing out. Until this problem is resolved, please fetch the virtual machine over HTTP, using a client such as wget or curl so that you can resume a download if it fails.

Release          Date          Size     via .torrent           via HTTP        md5
WormBase WS175   27 May 2007   7.7 GB   Available by request   WS175.vmx.tgz   e6615802e94133f8706d9177a71a521f
WormBase WS174   06 May 2007   9.0 GB   Available by request   WS174.vmx.tgz   b6d74790c486167eaa1e7083e53921d8
WormBase WS173   15 Apr 2007   8.2 GB   Available by request   WS173.vmx.tgz   7b3d6569a09a6e6d2ab9d6674c1d3af8
WormBase WS172   25 Mar 2007   7.9 GB   Available by request   WS172.vmx.tgz   531f2d1f1ee85d5d05a75e928494dd1a
WormBase WS171   04 Mar 2007   8.5 GB   Available by request   WS171.vmx.tgz   bf073383dc74e7292fef0f580c35b54a
WormBase WS170   09 Feb 2007   6.6 GB   Available by request   WS170.vmx.tgz   fa64d32a5dc3f9322a7a1db95aa0fe8b
WormBase WS160   31 Jul 2006   7.1 GB   Available by request   WS160.vmx.tgz   4aab931717dd384c05ef6bcae5e90ff9
WormBase WS150   30 Nov 2005   6.2 GB   Available by request   WS150.vmx.tgz   f30131fc6d2f5418e55c7c6baa847e6f
WormBase WS140   26 Mar 2005   4.9 GB   Available by request   WS140.vmx.tgz   262b4907d388107b24b1eded861fe214
WormBase WS130   16 Aug 2004   4.0 GB   Available by request   WS130.vmx.tgz   7644914a7ab10b8100b4cf5c54d177f2
WormBase WS120   07 Mar 2004   3.5 GB   Available by request   WS120.vmx.tgz   1727f2ac001e7d0f33502d99ab6606c2
WormBase WS110   01 Oct 2003   3.6 GB   Available by request   WS110.vmx.tgz   d01a4da5d751393f9b507fdaf2727ca5
WormBase WS100   10 May 2003   3.3 GB   Available by request   WS100.vmx.tgz   caa7290271451e38ff6600b517f49e18

Download Instructions

Downloading via the command line client curl

1. Open a terminal window

2. Type the following commands

curl -O http://www.wormbase.org/vmx/WS150.vmx.tgz
curl -O http://www.wormbase.org/vmx/WS150.vmx.tgz.md5
md5sum --check WS150.vmx.tgz.md5
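If a download is interrupted, curl can pick up where it left off, and the result can be compared directly against the md5 value listed in the table of available virtual machines. The following is a minimal sketch, not an official WormBase script: fetch_release and verify_md5 are hypothetical helper names, and it assumes the GNU md5sum tool is installed (on a Mac, the md5 command would need to be substituted).

```shell
#!/bin/sh
# fetch_release RELEASE
# Download RELEASE.vmx.tgz; "-C -" asks curl to resume a partial file
# already present in the current directory instead of starting over.
fetch_release() {
    curl -C - -O "http://www.wormbase.org/vmx/$1.vmx.tgz"
}

# verify_md5 FILE EXPECTED_HASH
# Succeeds (exit status 0) only if the md5 of FILE equals EXPECTED_HASH.
verify_md5() {
    actual=$(md5sum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}

# Example (not run automatically), using the WS150 hash from the table:
#   fetch_release WS150 &&
#   verify_md5 WS150.vmx.tgz f30131fc6d2f5418e55c7c6baa847e6f &&
#   echo "WS150 download verified"
```

Rerunning fetch_release after a failure continues the same file, which matters for archives of several gigabytes.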

Downloading via BitTorrent

Because of their size, virtual machines are available on the BitTorrent network. BitTorrent is a distributed peer-to-peer file sharing protocol that facilitates the download of large files. It does this by splitting files into many pieces. As each piece is fetched, it is made available for others to download, increasing the number of copies of that piece on the network. Instead of downloading one monolithic file from a single server, that same file can be downloaded piecemeal from potentially many peers.

1. Download and install a BitTorrent client

In order to fetch the WormBase BitTorrent files, you need a BitTorrent client. I recommend Azureus; alternatively, you might try the official BitTorrent client.

2. Download one of the small BitTorrent files for a desired release

3. Open the .torrent file in the BitTorrent client.

NOTE: When you have finished downloading, please keep your BitTorrent client open in order to help share the load of distributing WormBase releases.

Fetching new releases automatically

WormBase creates new virtual machines for every release of the database. Your computer can fetch these for you automatically using the following script.

forthcoming - this script is still under development

Unpacking a downloaded torrent

Once you've completely downloaded one of the releases, unpack it using tar/gzip. I've had bad luck extracting the file with things like the BomArchiveHelper.

prompt> ls
-rw-rw-r--  1 todd todd 8.5G Mar  8 13:06 wormbase-WS171.2007.03.04.tgz
prompt> tar xzf wormbase-WS171.2007.03.04.tgz

Note: Do make sure that you have 30-50 GB of disk space free prior to unpacking!
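The disk space check can be done before tar starts writing. Here is a minimal sketch under a few assumptions: free_gb and unpack_release are hypothetical helper names, the 50 GB default is simply the upper end of the range quoted above, and the archive is unpacked into the current directory.

```shell
#!/bin/sh
# free_gb DIR
# Print the whole number of gigabytes available on the filesystem that
# holds DIR. "df -Pk" reports 1024-byte blocks in a fixed POSIX layout,
# with the available space in column 4 of the second line.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# unpack_release ARCHIVE [MIN_GB]
# Refuse to unpack unless at least MIN_GB (default 50) gigabytes are free
# in the current directory.
unpack_release() {
    min_gb=${2:-50}
    if [ "$(free_gb .)" -lt "$min_gb" ]; then
        echo "Less than ${min_gb} GB free; not unpacking $1" >&2
        return 1
    fi
    tar xzf "$1"
}

# Example (not run automatically):
#   unpack_release wormbase-WS171.2007.03.04.tgz
```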

Using a WormBase Virtual Machine

Starting the Virtual Machine

Launch your Virtual Player / VM Fusion. From the File menu, select "Open". Navigate to the unpacked directory of the WormBase Virtual Machine. Open it. Inside, you will find a file ending with the suffix ".vmx". Select it, then select OK. The Virtual Machine will start to boot up.
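If you are unsure where the ".vmx" file ended up after unpacking, it can be located from a terminal. A small sketch, assuming only the standard find utility; find_vmx is a hypothetical helper name and the directory in the example is illustrative.

```shell
#!/bin/sh
# find_vmx DIR
# Print the path of every regular file under DIR whose name ends in ".vmx".
find_vmx() {
    find "$1" -type f -name '*.vmx'
}

# Example (not run automatically):
#   find_vmx wormbase-WS171.2007.03.04
```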

Logging in

Log in as

User: wormbase
Pass: wormbase

Navigating WormBase

After logging in, a browser window will appear taking you directly to your local copy of WormBase.

Advanced uses

Each WormBase Virtual Machine contains the entirety of WormBase. This makes it a great tool for tasks like data mining. You can also use individual components of WormBase to best suit your needs.

Xace - the graphical interface to AceDB

 /usr/local/acedb/bin/xace /usr/local/acedb/elegans

Tace - the text interface to AceDB

 /usr/local/acedb/bin/tace /usr/local/acedb/elegans

AcePerl - Perl API to AceDB

Mine the underlying AceDB database programmatically using AcePerl.

Bio::DB::GFF - Perl API to genomic annotations

Each virtual machine contains multiple mysql databases of genomic annotations, including (as of WS170, 2/2007), C. elegans, C. briggsae, and C. remanei. You can use the Bio::DB::GFF API -- part of BioPerl -- to easily mine these sequence annotations.

Running a WormBase Virtual Machine as a server

Each WormBase Virtual Machine contains everything that you need to run it as a standalone server. In server mode, you can install a single copy of WormBase locally and make it available to others in your lab or organization, or even establish your own local mirror of WormBase accessible to all.

To do this, you will need to modify a few settings of the virtual machine. The main consideration for running a virtual machine as a server is how to manage networking. Described below are the two most suitable approaches. Bridged networking is the easiest and requires no additional components.

Using a VM with bridged networking

When run in bridged networking mode, the VMX shares the host OS's network connection. This allows the guest OS to run with its own domain name and IP address.

1. Acquire a static IP and suitable domain name for your virtual machine

2. When launching the virtual machine, for networking, select "Bridged".

3. Configure network settings with your IP, domain name, and subnet mask.

In this example, the guest OS IP is 143.48.220.208. This should be changed to whatever your assigned IP address is.

ifconfig eth0:0 143.48.220.208 netmask 255.255.255.0 broadcast 143.48.220.255
route add -host 143.48.220.208 dev eth0

You can also do this from the GUI if you prefer, under System Settings -> Network. Double click on the network adaptor.

Address:  Your assigned IP address
Subnet mask: 255.255.255.0
Default gateway: 143.48.220.254
Broadcast host:  143.48.220.255 (not explicitly set in the GUI)
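The broadcast address in this example follows from the IP address and the subnet mask: each octet of the address is OR-ed with the inverted octet of the mask. If your assigned network uses a different mask, the value can be worked out the same way. A minimal sketch in POSIX shell; broadcast_addr is a hypothetical helper name, not something shipped with the virtual machine.

```shell
#!/bin/sh
# broadcast_addr IP NETMASK
# Print the broadcast address for a dotted-quad IP and netmask.
broadcast_addr() {
    old_ifs=$IFS
    IFS=.
    # Deliberately unquoted: split both arguments on "." into $1..$8,
    # so $1..$4 are the IP octets and $5..$8 the mask octets.
    set -- $1 $2
    IFS=$old_ifs
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
}

broadcast_addr 143.48.220.208 255.255.255.0   # prints 143.48.220.255
```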

4. Reset the MAC ID of the guest

System Tools > Network

Double click on the network adaptor and select the "Hardware" tab. Click on "Probe", then "OK"

5. Add the following lines to /etc/resolv.conf for DNS. For our example, these are:

search cshl.edu
nameserver 143.48.1.1
nameserver 143.48.1.20

6. Set the hostname

This can be done either in the GUI under the Network panel, or from the command line as described below.

If you have a static IP address, then /etc/hosts is configured as follows:

127.0.0.1	           localhost.localdomain      localhost
143.48.220.44	           mybox.mydomain.com         mybox

After updating the /etc/hosts file correctly, the "hostname" command should be run as follows to set your hostname:

hostname mybox.mydomain.com
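A mistyped address in /etc/hosts is easy to miss, so it can be worth checking the value before saving the file. The following is a sketch only: valid_ipv4 is a hypothetical helper name, and it checks just the dotted-quad form (four numeric octets, each no larger than 255), not whether the address is actually assigned to you.

```shell
#!/bin/sh
# valid_ipv4 ADDRESS
# Succeeds (exit status 0) if ADDRESS is four dot-separated decimal
# octets, each between 0 and 255.
valid_ipv4() {
    case "$1" in
        # Reject anything but digits and dots, empty octets,
        # and leading or trailing dots.
        *[!0-9.]*|*..*|.*|*.) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1          # deliberately unquoted: split on "."
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ "$octet" -le 255 ] 2>/dev/null || return 1
    done
    return 0
}

# Example:
#   valid_ipv4 143.48.220.44 && echo "address looks well formed"
```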

7. Edit /usr/local/wormbase/conf/localdefs.pm and httpd.conf with the appropriate hostname as necessary.

8. Place the VM into console mode:

Comment out the runlevel 5 directive and uncomment the runlevel 3 directive in /etc/inittab:

$ sudo perl -p -i -e 's/^id:5:initdefault:/#id:5:initdefault:/' /etc/inittab
$ sudo perl -p -i -e 's/^#id:3:initdefault:/id:3:initdefault:/' /etc/inittab
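After editing /etc/inittab, it is worth confirming which runlevel is now the active default before rebooting. A minimal sketch, assuming the file uses the standard "id:N:initdefault:" syntax; default_runlevel is a hypothetical helper name.

```shell
#!/bin/sh
# default_runlevel FILE
# Print the runlevel from the uncommented "id:N:initdefault:" line of
# FILE. Commented-out lines (starting with "#") are ignored.
default_runlevel() {
    sed -n 's/^id:\([0-9]\):initdefault:.*/\1/p' "$1"
}

# Example (not run automatically); console mode should report 3:
#   default_runlevel /etc/inittab
```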

9. Shut down the virtual machine and copy it as a backup

I append "server" to the name to indicate that it is configured as a server

  tar czf wormbase-WS100.2003.05.13-server.tgz wormbase-WS100.2003.05.13

10. Restart the Virtual Machine. You should be good to go!

Using a VM with NAT addressing

By default, WormBase virtual machines use NAT networking. If you would like to use a server in NAT mode, you will need to configure an upstream proxy server (Apache or Squid, for example) to access the virtual machine remotely.

Troubleshooting

Resetting the MAC address

MAC addresses are set automatically in virtual machines based on the host OS name and the path to the .vmx file on the host OS. If you move the virtual machine, the MAC address will be incorrect.

When booted into graphical mode:

System Tools > Network

Double click on the network adaptor and select the "Hardware" tab. Click on "Probe", then "OK".