Website:JBrowse set-up on dev.wormbase.org

From WormBaseWiki

These are the directions for setting up a new instance of JBrowse on dev.wormbase.org, which is where the current production version of JBrowse is hosted.

Set up JBrowse

Get the current JBrowse release and unpack it somewhere with a fair amount of disk space (the current build is about 45GB). JBrowse will be served from this directory, so it will need to be made accessible to Apache once the setup is complete. The actual setup of JBrowse is quite easy; in the directory where the JBrowse release has been unzipped, run:

 ./setup.sh

without root privileges. The setup script uses local::lib to install any needed Perl prerequisites and also formats some sample data (which can be deleted later from the sample_data directory).
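
The step above could be wrapped roughly as follows. This is only a sketch: the function name run_jbrowse_setup and its argument are hypothetical, and it assumes the release has already been unzipped into the directory you pass in. It refuses to run as root, since setup.sh is meant to be run without root privileges.

```shell
# Sketch: run JBrowse's setup.sh from an unpacked release directory as a
# regular (non-root) user. The function name and argument are hypothetical.
run_jbrowse_setup() {
    jbrowse_dir=$1

    # setup.sh is meant to be run without root privileges.
    if [ "$(id -u)" -eq 0 ]; then
        echo "refusing to run setup.sh as root" >&2
        return 1
    fi

    if [ ! -f "$jbrowse_dir/setup.sh" ]; then
        echo "no setup.sh found in $jbrowse_dir" >&2
        return 1
    fi

    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$jbrowse_dir" && ./setup.sh )
}
```

It would be called with the path to the unpacked release, for example run_jbrowse_setup /path/to/jbrowse (a placeholder path).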

Set up build configuration

From a fresh checkout of the website-admin repository, copy the build config file, website-admin/jbrowse/conf/c_elegans.jbrowse.conf, to a temporary working directory. This directory is where GFF files will be copied and parsed, and where the JBrowse script flatfile-to-json.pl will be run. While the config file is named "c_elegans...", it is actually organism agnostic and will be used for every organism. c_elegans.jbrowse.conf is a fairly simple ini-style config file. It has a general section at the top, which you will need to edit, and a section for each track below. You generally will not need to edit the track sections unless a new data type is added that no organism in JBrowse has used before. For example, if a track is already configured for C. elegans and is then added for C. briggsae, you won't have to change anything in the config file for the build script to automatically include that track for C. briggsae.
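
A minimal sketch of the copy step, assuming you pass in the path to your website-admin checkout and a working directory of your choosing. The function name stage_build_conf is hypothetical; only the config file's location inside the checkout comes from the text above.

```shell
# Sketch: stage the build config in a temporary working directory.
# $1 = path to the website-admin checkout, $2 = working directory to create.
stage_build_conf() {
    checkout=$1
    workdir=$2
    src="$checkout/jbrowse/conf/c_elegans.jbrowse.conf"

    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }

    # Create the working directory if needed, then copy the config into it.
    mkdir -p "$workdir" && cp "$src" "$workdir/"
}
```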

This is a sample from the top of c_elegans.jbrowse.conf:

 release=246
 filedir=/usr/local/ftp/pub/wormbase/releases/
 nosplitgff = 1
 usenice=1
 skipprepare=0
 jbrowsedir=/home/scain/scain/jbrowse-test
 allstats=/usr/local/wormbase/website/tharris/conf/gbrowse/releases/WS246/ALL_SPECIES.stats
 includes=/home/scain/scain/website-admin/jbrowse/jbrowse/data/c_elegans/includes
 functions=/home/scain/scain/website-admin/jbrowse/jbrowse/data/functions.conf
 organisms=/home/scain/scain/website-admin/jbrowse/jbrowse/data/organisms.conf
 glyphs=/home/scain/scain/website-admin/jbrowse/jbrowse/src/JBrowse/View/FeatureGlyph

Most of these items you won't have to change often or at all. The primary things you need to change are the "release" entry and the "allstats" entry (though the need for the allstats entry may go away, as the path to it should be inferable from the release number). The paths to items in the website-admin directory could be updated to point to your current checkout to ensure that the most current configuration is used. The other items probably won't ever need to be changed, but here's an explanation of the non-obvious ones:

  • nosplitgff - (not currently used) Some of the formatting steps can take a very long time with large GFF files, so one of the intermediate steps the script can take is to split the GFF file into multiple files based on the reference sequence. Splitting may be less desirable when the genome consists of 10,000 contigs than when there are 6 chromosomes.
  • usenice - Run all of the commands with the Unix nice command to bump their priority down in the command scheduler.
  • skipprepare - Skips the running of the prepare-refseqs.pl JBrowse script. Generally there isn't much point in turning this on, even if you are rerunning the build in a directory where prepare-refseqs.pl has already been run for a given data set, as prepare-refseqs.pl is much faster than formatting track data.
  • jbrowsedir - Path to the JBrowse directory that data will be served from.
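
Since "release" and "allstats" are the entries that change with each build, the update could be scripted along these lines. This is a sketch, not part of the build tooling: the function name set_release is hypothetical, and it assumes the allstats path embeds the release as WS followed by the release number, as in the sample above.

```shell
# Sketch: point an existing build config at a new WormBase release.
# Assumes the "release=NNN" and "allstats=.../WSNNN/..." layout shown in the
# sample config; set_release is a hypothetical helper name.
set_release() {
    conf=$1
    new_release=$2

    # Only accept a purely numeric release such as 247.
    case $new_release in
        ''|*[!0-9]*) echo "release must be a number" >&2; return 1 ;;
    esac

    tmp_conf=$(mktemp) || return 1

    # Rewrite the release line, and the WSNNN component of the allstats path.
    sed -e "s/^release[[:space:]]*=.*/release=$new_release/" \
        -e "/^allstats[[:space:]]*=/s/WS[0-9][0-9]*/WS$new_release/" \
        "$conf" > "$tmp_conf" && cat "$tmp_conf" > "$conf"
    status=$?

    rm -f "$tmp_conf"
    return $status
}
```

For example, set_release c_elegans.jbrowse.conf 247 would change release=246 to release=247 and WS246 to WS247 in the allstats path, leaving every other line alone.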

Track specific configurations

Create a multi-species build script

Create a "multi-species" build shell script and run it. This is best done in a screen session, with stdout and stderr captured to a file. The build will take about a day.
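
One way to capture the output of such a long run is sketched below. The helper name run_logged and the script name in the comment are hypothetical, since this page does not name the multi-species script.

```shell
# Sketch: run a long build command with stdout and stderr captured in one log.
# $1 = log file; the remaining arguments are the command to run.
run_logged() {
    log_file=$1
    shift
    # Send both output streams to the log; the command's exit status is returned.
    "$@" > "$log_file" 2>&1
}

# To keep the build alive after a dropped SSH connection, the same redirection
# can be started in a detached screen session (hypothetical script name):
#   screen -dmS jbrowse-build sh -c './multi_species_build.sh > build.log 2>&1'
```

The session can be reattached later with screen -r jbrowse-build to check on progress.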

Update the Apache config to point at the new build.