Difference between revisions of "Administration:Installing WormMine"

From WormBaseWiki

Revision as of 16:29, 16 October 2013

How to set up a development instance of WormMine

Requirements

Hardware

Linux

  • 8 cores
  • 24GB RAM
  • ~ 1TB storage

Software

Necessary software and versions:

Software     Minimum Version   Purpose
Git          1.7               check out and update source code
Java SDK     6.0               build and use InterMine
Ant          1.8               invokes the InterMine build
Tomcat       6.0.29            website
PostgreSQL   8.3               database
Perl         5.8.8             run build scripts
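
Before starting, it can help to confirm that the tools in the table are on the PATH. The following is a minimal sketch and not part of WormMine: the function name missing_tools is hypothetical, and the command names (git, java, ant, psql, perl) are assumptions about how these packages are exposed on your system. It does not compare version numbers, and it skips Tomcat, which is normally not on the PATH.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# (missing_tools is a hypothetical helper, not part of InterMine.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: report which prerequisites still need installing.
# Prints nothing when every command is found.
missing_tools git java ant psql perl
```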

Installation / configuration

Dependencies

Git

Install the command line tool:

$ sudo apt-get install git-core

Configure your user and email:

$ git config --global user.name "Name Surname"
$ git config --global user.email "your.email@gmail.com"

Java

Download here. Since InterMine can be memory intensive, it's helpful to pass JVM options to Ant through the ANT_OPTS environment variable.

$ export ANT_OPTS="-server -XX:MaxPermSize=256M -Xmx1700m -XX:+UseParallelGC -Xms1700m -XX:SoftRefLRUPolicyMSPerMB=1 -XX:MaxHeapFreeRatio=99"

Ant

Refer to ant's manual for installation instructions.

Tomcat

Refer to Tomcat InterMine installation

After installing it:

export CATALINA_HOME=/YOURPATH/apache-tomcat-6.0.36

Starting Tomcat:

$CATALINA_HOME/bin/startup.sh

Stopping Tomcat:

$CATALINA_HOME/bin/shutdown.sh
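
The start and stop commands above can be wrapped in a small helper that first checks CATALINA_HOME. This is only a sketch under stated assumptions: tomcat_ctl is a hypothetical name, and the 5-second pause in the restart case is an arbitrary guess at how long Tomcat needs to release its ports.

```shell
#!/bin/sh
# Hypothetical wrapper around the Tomcat scripts shown above.
# Usage: tomcat_ctl start|stop|restart
tomcat_ctl() {
    if [ -z "${CATALINA_HOME:-}" ] || [ ! -d "$CATALINA_HOME/bin" ]; then
        echo "CATALINA_HOME is not set or has no bin directory" >&2
        return 1
    fi
    case "$1" in
        start)
            "$CATALINA_HOME/bin/startup.sh"
            ;;
        stop)
            "$CATALINA_HOME/bin/shutdown.sh"
            ;;
        restart)
            # Stop first, wait an arbitrary 5 seconds, then start again.
            "$CATALINA_HOME/bin/shutdown.sh"
            sleep 5
            "$CATALINA_HOME/bin/startup.sh"
            ;;
        *)
            echo "usage: tomcat_ctl start|stop|restart" >&2
            return 2
            ;;
    esac
}
```

After sourcing the file and exporting CATALINA_HOME, "tomcat_ctl restart" performs a shutdown followed by a startup.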

PostgreSQL

Refer to InterMine PostgreSQL installation guide

Perl

Refer to InterMine Perl installation guide

Download and Install WormMine

Navigate to the folder where you want to install WormMine.

Download code from Git:

~]$ git clone https://github.com/WormBase/website-intermine.git
Cloning into website-intermine...
remote: Counting objects: 2866, done.
remote: Compressing objects: 100% (1054/1054), done.
remote: Total 2866 (delta 1728), reused 2837 (delta 1699)
Receiving objects: 100% (2866/2866), 25.28 MiB | 4.65 MiB/s, done.
Resolving deltas: 100% (1728/1728), done.

This downloads the entire intermine project repository. The mine itself is a submodule of this.

Initialize submodule

~]$ cd website-intermine
website-intermine]$ git submodule update --init
Submodule 'acedb-dev/intermine' (git@github.com:WormBase/intermine.git) registered for path 'acedb-dev/intermine'
Cloning into acedb-dev/intermine...
Warning: Permanently added the RSA host key for IP address '192.30.252.131' to the list of known hosts.
remote: Counting objects: 379065, done.
remote: Compressing objects: 100% (78659/78659), done.
remote: Total 379065 (delta 233912), reused 377610 (delta 232792)
Receiving objects: 100% (379065/379065), 685.14 MiB | 6.47 MiB/s, done.
Resolving deltas: 100% (233912/233912), done. 
Submodule path 'acedb-dev/intermine': checked out 'd640534eda614d60558c6561da6fb9311d6ad893'

This populates the intermine directory at website-intermine/acedb-dev/intermine. The checkout then needs to be set to the proper branch.

Navigate to the mine:

website-intermine]$ cd acedb-dev/intermine/

Working with the Development Branch

Get the "website-intermine" repo as before, but do not init the submodule. Instead proceed as follows:

git clone https://github.com/WormBase/website-intermine.git
git clone https://github.com/WormBase/intermine

# Create link to the InterMine sources within the web-site context. Replaces sub-module.
cd website-intermine/acedb-dev
rmdir intermine
ln -s ../../intermine .
cd ../../intermine

# Get the development branch (named "unmerged") from the origin remote.
# A GitHub /tree/<branch> web URL is not a valid git remote, so none is added.
git fetch origin
git checkout unmerged
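
To confirm that the symbolic link created above really replaces the submodule directory, a check along the following lines may help. It is a sketch only: check_intermine_link is a hypothetical name, and its argument is whatever path you cloned website-intermine into.

```shell
#!/bin/sh
# Hypothetical check: is <website-intermine>/acedb-dev/intermine a symlink
# that resolves to an existing directory?
# $1 = path to the website-intermine checkout
check_intermine_link() {
    link="$1/acedb-dev/intermine"
    if [ -L "$link" ] && [ -d "$link" ]; then
        echo "ok: $link resolves to a directory"
    else
        echo "problem: $link is not a symlink to a directory" >&2
        return 1
    fi
}

# Usage, run from the folder that contains both clones:
#   check_intermine_link ./website-intermine
```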

Create properties file

  • Create ~/.intermine directory.
  • Copy the sample properties file as ~/.intermine/wormmine.properties
  • Fill in placeholders as follows:
    • <POSTGRES USER PASSWORD>: postgres password for intermine user
    • <TOMCAT USER PASSWORD>: tomcat password for intermine user
    • <SERVER PUBLIC BASE URL>: Base url of your web server, including port. Sample: http://123.456.789.123:8080
    • <CREATE WM ADMIN USERNAME/PASSWORD>: username and password used to create the primary admin account
    • <EMAIL ADDRESS TO SEND HELP EMAILS FROM>: your server should be configured to send emails from this address. This will send users password reset emails and the like.
    • <HELP REQUESTS ARE SENT HERE>: can be same address as above, this is where input from the InterMine help form gets sent.
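
Once the file has been edited, a quick search can show whether any placeholder was missed. This sketch assumes the sample file marks placeholders the way this page does, as upper-case text inside angle brackets; unfilled_placeholders is a hypothetical name.

```shell
#!/bin/sh
# List lines that still contain an unfilled <UPPERCASE PLACEHOLDER>.
# Output format is "line-number:line". Exit status is 0 when at least one
# placeholder is found and 1 when none remain (grep's usual convention).
# $1 = properties file (defaults to ~/.intermine/wormmine.properties)
unfilled_placeholders() {
    grep -n '<[[:upper:]][[:upper:] /]*>' "${1:-$HOME/.intermine/wormmine.properties}"
}

# Usage:
#   unfilled_placeholders && echo "placeholders still need filling in"
```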

Get production database

Build a new production database

The build requires:

  • Fasta
  • GFF3
  • GO ontology
  • GO association
  • Ace XML files

All of these are retrieved from the FTP site, except for the Ace XML files.

Generate data files

These must be created on a machine with a tace instance, and stored in an accessible location.

Generate Ace XML files

On a machine with an Ace instance:

  • Download the website-intermine repository as described above
  • Navigate to acedb-dev/acedb
website-intermine]$ cd acedb-dev/acedb/

imdump.sh is a shell script that generates XML files for each species in the model and writes them into the supplied destination directory. The InterMine machine must have access to this directory.

  • Here all Ace class XML files are written to /nfs/wormbase/wormmine/acedb_dumps/WS239/
acedb]$ ./imdump.sh /nfs/wormbase/wormmine/acedb_dumps/WS239
Species
... done.
Gene
... done.
... and so on

If the directory passed to imdump.sh ends with a trailing slash (/), the script will not function correctly.
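
One way to guard against this is to normalize the destination before calling the script. The sketch below assumes the problematic character is the trailing path separator "/"; strip_trailing_slash is a hypothetical helper and not part of the repository.

```shell
#!/bin/sh
# Remove every trailing "/" from a path, but leave a bare "/" untouched.
strip_trailing_slash() {
    p=$1
    while [ "${p%/}" != "$p" ] && [ "$p" != "/" ]; do
        p=${p%/}
    done
    printf '%s\n' "$p"
}

# Usage (not executed here):
#   ./imdump.sh "$(strip_trailing_slash /nfs/wormbase/wormmine/acedb_dumps/WS239/)"
```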

Acquire and pre-process data files

On the InterMine machine, in the intermine directory:

  • Navigate to the redeployment folder.
intermine]$ cd redeployment/
  • The update.properties file should contain these two entries:
release = WS239
ace-xml-dir = /nfs/wormbase/wormmine/acedb_dumps/${release}

The release is used to generate path and file name strings, and must match the release naming used on the FTP site (for example WS239). ace-xml-dir is where the build looks for the Ace XML files generated above.
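
To see which directory the two entries resolve to before running the build, the file can be read from the shell. This is a sketch that mimics the ${release} substitution for these two keys only; get_prop and resolve_ace_xml_dir are hypothetical names, and the key passed to get_prop is used as a regular expression, so it should not contain special characters.

```shell
#!/bin/sh
# Print the value of key $1 from properties file $2 (last definition wins).
get_prop() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# Print ace-xml-dir from file $1 with the literal text ${release}
# replaced by the value of the release entry.
resolve_ace_xml_dir() {
    release=$(get_prop release "$1")
    get_prop ace-xml-dir "$1" | sed 's|[$]{release}|'"$release"'|g'
}

# Usage, from the redeployment folder:
#   dir=$(resolve_ace_xml_dir update.properties)
#   [ -d "$dir" ] || echo "Ace XML directory $dir is missing"
```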

The build downloads and processes the FASTA, GFF3, GO and GAF files, and also copies and processes the Ace XML files.

Build configuration

Property                     Function
datadir                      The data directory for WormMine
release                      WormBase release version to use for paths and filenames
backup-dirname               Directory the old data directory is backed up to
genomic-fasta-species-file   Species to download and/or process genomic FASTA for
protein-fasta-species-file   Species to download and/or process protein FASTA for
gff3-species-file            Species to download and/or process GFF3 files for
ace-classes file             Ace classes to copy and/or process

ant -p lists all available targets, each of which can be run individually.

  • Run the build

This will back up the old data directory into backup-dirname, delete it, then download and process all file types.

redeployment]$ ant

To only download and process:

redeployment]$ ant run-all

Build the database

  • From the intermine directory, navigate to wormmine
intermine]$ cd wormmine
  • Run build
wormmine]$ ../bio/scripts/project_build -b -v localhost wormmine_dump

This will run all sources configured in the intermine/wormmine/project.xml file. To learn more about the project.xml file, refer to the official documentation.

Any issues encountered can be raised on the InterMine developer list at dev (AT) intermine.org.

About the database

The database name is configured in the properties file as db.production.datasource.databaseName. Each table represents a class in the model; additional tables represent many-to-many collections and various metadata. The InterMine development team does not currently advise developers to modify the backend database, because of its many layers of inheritance. Questions may be directed to the InterMine developer list at dev (AT) intermine.org.

Instantiate database dump

To instantiate a previously built WormMine production database:

  • Find your favorite release from WORMMINE DB FTP URL (placeholder, no URL exists)
  • Create empty DB
> createdb -U intermine -E SQL_ASCII wormmine
    • -U: user set to intermine
    • -E: character encoding of the new database
  • Unpack and restore DB
> psql -U intermine -d wormmine -f <WORMMINE RELEASE SQL>
    • -U: execute as user
    • -d: destination DB
    • -f: SQL input file

Create userprofile database

InterMine needs a separate database to track users and their information.

  • Create empty DB
> createdb -U intermine -E SQL_ASCII userprofile-wormmine
  • Build the userprofile DB
> cd wormmine/webapp
> ant build-db-userprofile

This formats the empty userprofile database for mine use.
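
To confirm that both databases exist, the output of psql's database listing can be checked. The sketch below is an assumption-laden helper rather than part of InterMine: db_in_list is a hypothetical name, and it expects the pipe-separated, unaligned-header output that "psql -lqt" prints.

```shell
#!/bin/sh
# Succeeds if database name $1 appears in the first column of
# `psql -lqt` style output read from standard input.
db_in_list() {
    cut -d '|' -f 1 |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -Fxq -- "$1"
}

# Usage (needs a running PostgreSQL server, so it is left commented out):
#   psql -U intermine -lqt | db_in_list wormmine && echo "wormmine exists"
#   psql -U intermine -lqt | db_in_list userprofile-wormmine && echo "userprofile exists"
```

Matching the whole first column, rather than searching for a substring, keeps "wormmine" from being confused with "userprofile-wormmine".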

About the userprofile database

The database name is set in the properties file as db.userprofile-production.datasource.databaseName. User information is stored in the userprofile table. Tables that begin with "saved" map users to any data they have saved, such as lists, queries and templates. List data mapping is stored in bagvalues.

Launch webapp

  • Navigate to intermine/wormmine/webapp
  • Launch webapp:
 > ./xx

This script contains:

ant clean
ant -v default remove-webapp release-webapp

These two commands may be run in sequence instead of the script. They clear previous webapp files, remove any previously deployed webapp, and compile and release a new webapp.

Test Webapp

You should be able to reach your new instance at <baseurl>/wormmine. The webapp is standalone.
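
The same check can be scripted with curl. This is a minimal sketch: webapp_url and check_webapp are hypothetical names, the base URL in the usage line is only an example, and curl is assumed to be installed.

```shell
#!/bin/sh
# Join a base URL and a webapp path without doubling the slash between them.
webapp_url() {
    printf '%s/%s\n' "${1%/}" "${2#/}"
}

# Succeeds if the URL answers with a non-error HTTP status (requires curl).
# $1 = base URL, $2 = webapp path
check_webapp() {
    curl --fail --silent --show-error --location --output /dev/null \
        "$(webapp_url "$1" "$2")"
}

# Usage, with an example base URL:
#   check_webapp http://localhost:8080 wormmine && echo "WormMine is up"
```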

About the webapp

Webapp maintenance is fairly simple. Packaged monitoring services are not provided, and logs are stored in intermine/wormmine/webapp/intermine.log and $CATALINA_HOME/logs. Problems that arise may be handled by restarting the web application (shut down first, then start up):

$CATALINA_HOME/bin/shutdown.sh
$CATALINA_HOME/bin/startup.sh

Attach to WormBase instance

If you want to enable integration with WormBase, follow these steps:

Checkout merged branch

> git checkout remotes/origin/staging
Note: checking out 'remotes/origin/staging'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 611b791... new tests, shell script to run tests

> git checkout -b staging
Switched to a new branch 'staging'

Reconfigure properties file

  • The webapp needs to be deployed at tools/wormmine:
webapp.path=tools/wormmine
  • Change url to where the base will be:
webapp.baseurl=http://staging.wormbase.org
webapp.returnurl=http://staging.wormbase.org/auth/openid?openid_identifier=https://www.google.com/accounts/o8/id&redirect=http://dev.wormbase.org/tools/wormmine/mymine.do#
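
These entries can be changed in any editor. For scripted deployments, a helper like the one below could set them instead. It is a sketch only: set_prop is a hypothetical name, and it assumes keys are written without spaces around the "=" sign, as in the lines above.

```shell
#!/bin/sh
# Set key $1 to value $2 in properties file $3.
# An existing "key=..." line is replaced; otherwise the entry is appended.
set_prop() {
    tmp=$(mktemp) || return 1
    PROP_KEY=$1 PROP_VALUE=$2 awk -F= '
        $1 == ENVIRON["PROP_KEY"] {
            print $1 "=" ENVIRON["PROP_VALUE"]
            found = 1
            next
        }
        { print }
        END { if (!found) print ENVIRON["PROP_KEY"] "=" ENVIRON["PROP_VALUE"] }
    ' "$3" > "$tmp" && cat "$tmp" > "$3"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage:
#   set_prop webapp.path tools/wormmine ~/.intermine/wormmine.properties
```

Passing the key and value through the environment avoids awk interpreting characters such as "&" or backslashes in the URL values.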

Modify wormbase.conf

To enable the login system, make sure the config flag wormmine_path = 'tools/wormmine' is uncommented.
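
Whether the flag is active can be checked from the shell. This sketch assumes that commented-out lines in wormbase.conf begin with "#"; conf_flag_enabled is a hypothetical name and the file path in the usage line must be adjusted to your WormBase checkout.

```shell
#!/bin/sh
# Succeeds if file $2 has an uncommented "<flag> =" line for flag $1.
# A line such as "# wormmine_path = ..." does not match, because only
# whitespace is allowed before the flag name.
conf_flag_enabled() {
    grep -Eq "^[[:space:]]*$1[[:space:]]*=" "$2"
}

# Usage:
#   conf_flag_enabled wormmine_path wormbase.conf || echo "login integration is disabled"
```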