Updating WormBase


Document conventions

Commands that require super-user privileges are prefaced with "sudo" or a "$" prompt. If the system is configured correctly, you should not need to be logged in as root in order to update the site.

*** Potential stumbling blocks are indented and highlighted with a
 *** preceding triple asterisk.  Yikes!
 

Schedule

WormBase is updated with each new build of the database released from Sanger. The live site and the development sites are updated on the same day. A typical cycle looks like this:

Fri Jan 1  - WS100 released
           - live site updated to WS99
           - development site updated to WS100
Fri Jan 15 - WS101 released
           - live site updated to WS100
           - development site updated to WS101

Although almost entirely scripted, the update process may require some hands-on monitoring or troubleshooting. This is especially true during cycles where the data model changes significantly.

Quick Start Guide

Updating the production nodes

The steps to updating the site should be executed in the following order:

1. Check out code changes into the "production" directory

brie3> cd /usr/local/wormbase-production
brie3> cvs -n update  // resolve any conflicts and migrate changes as needed

2. Analyze logs

fe> /usr/local/wormbase-admin/log_maintenance/analysis/concatenate_logs.sh [OLD VERSION] www.wormbase.org
brie6> /usr/local/wormbase-admin/log_maintenance/analysis/analyze_logs.sh

3. Reset the squid cache on fe.wormbase.org

fe> sudo /etc/rc.d/init.d/squid fullreset

4. Update the production servers

The live servers will be updated automatically by the wormbase update script wb_update_wormbase.pl.

Run the following command on brie6, aceserver, blast, vab, and gene. The configuration file is found automatically, stored in /usr/local/wormbase-admin/update_scripts/conf based on the name of the machine. You can also specify the path to a config file explicitly using the --config option. Make sure there is adequate disk space before running this command!

brie6> /usr/local/bin/wb_update_wormbase.pl > /usr/local/wormbase/logs/update.log
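
Before launching the update, confirm there is free space on the partitions holding the acedb, GFF, and blast databases; the --config option can also be pointed at an explicit file. A minimal sketch, assuming the paths used elsewhere in this document (the brie6.conf file name is hypothetical):

brie6> df -h /usr/local/acedb /usr/local/wormbase /blast
brie6> /usr/local/bin/wb_update_wormbase.pl --config /usr/local/wormbase-admin/update_scripts/conf/brie6.conf \
           > /usr/local/wormbase/logs/update.log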

Restart servers

brie6> sudo kill -9 `ps -C sgifaceserver -o pid=`
brie6> sudo /etc/rc.d/init.d/xinetd restart
brie6> sudo /etc/rc.d/init.d/mysql restart
brie6> restart_web

5. Send out a release notification

brie6> /usr/local/wormbase-admin/update_scripts/send_release_notification.pl

6. Make sure the ontology sockets are working:

 vab> sudo /usr/local/wormbase/cgi-perl/ontology/browser_lib/launch_ontology_sockets.sh {WSVERSION}
 gene> (run the same command on gene)

Updating the development server

1. Confirm that there is sufficient disk space

You can delete files from:

/usr/local/wormbase/databases
~ftp/pub/wormbase/acedb
~ftp/pub/wormbase/genomes/elegans/genome_feature_tables/GFF2
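
For example, to see what is consuming space and clear out a release that is no longer needed (the WS118 directory below is purely illustrative; delete only releases you are sure are obsolete):

brie3> df -h /usr/local /usr/local/ftp
brie3> du -sh /usr/local/wormbase/databases/*
brie3> rm -rf ~ftp/pub/wormbase/acedb/WS118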

2. Update the development server

 brie3> /usr/local/wormbase/update_scripts/update_wormbase-WRAPPER.sh [new_WSXXX] CB25


3. Run the "pipeline_addendums.sh" script. This collects a few things together for simplicity:

/usr/local/wormbase-admin/update_scripts/pipeline_addendums.sh OLDVERSION NEWVERSION

A. CVS tag/branch the software

brie3> cd /usr/local/wormbase-production
brie3> cvs tag WSXXX (The version of the current development release)
brie3> cvs tag -r WSXXX -F current_release


(This needs to be handled manually) If this is a frozen release, create a branch, too:

brie3> cvs tag -b -r WS160 WS160-branch


B. Create a software release

brie3> /usr/local/wormbase-admin/update_scripts/create_software_release.pl WSXXX


 This script handles the following:
 Installing the ontology databases (/usr/local/wormbase/cgi-perl/ontology/browser_lib/install.sh [NEW_VERSION])


C. Update the strain list database

brie3> /usr/local/wormbase/bin/strain_search/UPDATE_STRAINS \
           /usr/local/wormbase/html/databases/strain_search \
           /usr/local/acedb/elegans_WSXXX



4. Build the autocomplete database

cd /usr/local/wormbase/update_scripts/autocomplete/meta
./create_autocomplete_db.sh [NEW_VERSION] [MYSQL_USER] [MYSQL_PASS]
./do_load_of_db.sh [WSVERSION] [/path/to/acedb]


- Update the standard_urls.xml file in brie3:/usr/local/wormbase/html/standard_urls and commit


- Update the wiki release notes


- Make sure the ontology browser is running


5. Build the new genetic map database and add some content to the GFF database.

brie3> /usr/local/wormbase/temporary_patches/current.sh [new WS version]


 - calculates genetic intervals
 - calculates genomic coordinates of protein motifs
 - creates a new genetic map
 - loads some temporary annotations


6. Package databases

brie3> /usr/local/wormbase-admin/update_scripts/package_databases.pl


Each step is described in more detail below.

 

Detailed guide

THIS SECTION HAS NOT YET BEEN WIKIFIED

=head3 1. Test problematic URLs

In order to ensure that the site is ready to be updated, visually confirm that the test URLs are displayed properly. A list of these URLs -- along with descriptions of specific items to check -- is maintained at:

http://www.wormbase.org/db/private/test_urls

The username and password are the same as those for the mailing list archive.

=head3 2. Check out code changes into the "production" directory

Instead of maintaining separate CVS repositories on each of the back end servers, a single production-grade version of the software is maintained on the development server at:

/usr/local/wormbase-production

Migrate any changes made during the development period into this repository by:

todd@brie3> cd /usr/local/wormbase-production
todd@brie3> cvs update
*** Be certain to resolve any cvs conflicts and recommit them!
 

Changes will be delivered to the production servers during the next automatic update.

See Appendix 1, "Migrating software changes onto the production servers" for additional details.

=head3 3. CVS tag/branch the software

After updating the production repository, appropriate cvs tags and branches should be created to enable retrieval of a specific revision.

For all releases, create a tag on the main trunk corresponding to the current (or soon to be current) live release. For example, if WS99 has just been migrated to the live site, execute the following commands on the *live* site:

todd@brie3> cd /usr/local/wormbase-production
todd@brie3> cvs tag WS99

For genome freeze releases, create a branch so that we can continue to make bug fixes on the freeze sites. The following command will create a branch at the tag you just created for the WS100 release:

todd@brie3> cvs tag -b -r WS100 WS100-branch

The current_release tag always points at the most current (and hopefully stable) version of the software. Finally, reset it to point at the newly created release tag:

todd@brie3> cvs tag -r WS99 -F current_release

See http://www.wormbase.org/docs/SOPS/cvs-usage.html for additional information.

Note: For WS120, the branch/tag names were inverted. Since development had already begun on the WS120 branch, I am unable to rename it.

  WS120                           (branch: 1.2.2)
  WS120-branch_point              (revision: 1.2)

=head3 4. Create a software release

Software releases are created using the just-created tag on the main software trunk. On the development server, issue the following command.

% /usr/local/wormbase/update_scripts/create_software_release.pl [WSXXX]

This will export a copy of the repository at that tag to the software archive on the FTP site, omitting the CVS directories. Substitute the WS release just migrated to the live site for [WSXXX].
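
By hand, the export amounts to roughly the following; the CVSROOT, module name, and destination directory shown here are assumptions, since the script takes care of the details:

% cvs -d /usr/local/cvs export -r WS99 -d wormbase-WS99 wormbase
% tar czf wormbase-WS99.tar.gz wormbase-WS99
% mv wormbase-WS99.tar.gz ~ftp/pub/wormbase/software/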

=head3 5. Update the development server

The development site is currently updated by /update_scripts/update_wormbase-dev.pl which reads a configuration file (update_wormbase-dev.cfg). This script will perform the following tasks:

For C. elegans:
1.  Check for a new release on the Sanger FTP site
2.  Mirror new release to /usr/local/ftp/pub/wormbase/acedb/WSXXX
3.  Untar the release to /usr/local/acedb/elegans_WSXXX
4.  Copy acedb configuration files to the new acedb database
5.  Create nucleotide blast tables at /blast/blast_WSXXX
6.  Create peptide blast tables at /blast/blast_WSXXX
7.  Create EST blast tables at /blast/blast_WSXXX
8.  Load the C. elegans GFF database
9. Load the C. elegans pmap GFF database
10. Load the PCR/Oligo database
11. Create a variety of feature dumps for the FTP site
12. Build the blat database
13. Restart servers
For C. briggsae:
1.  Check for a new release on the wormbase FTP site
2.  Mirror briggsae data from dev.wormbase.org to
       /usr/local/ftp/pub/wormbase/genomes/briggsae/CBXX
3.  Create nucleotide blast tables at /blast/blast_WSXXX
4.  Create peptide blast tables at /blast/blast_WSXXX
5.  Create EST blast tables at /blast/blast_WSXXX
6.  Build the blat database
7.  Load the C. briggsae GFF database (off by default)
8.  Restart servers

For the C. briggsae genome, which is not undergoing systematic revision at this time, a collection of precomputed files necessary for loading the GFF databases are stored on dev.wormbase.org. These files are mirrored to /usr/local/wormbase/mirrored_data/briggsae/CB25.

Please see the pod documentation contained in update_wormbase-dev.pl for additional information.

A simple shell wrapper script is provided that avoids the rather lengthy incantation of the update_wormbase-dev.pl script (as well as creating a directory to store the update logs).

todd@brie3> update_wormbase-dev-WRAPPERS.sh [new_WSXXX] CB25

=head3 6. Package databases

Once the new version of the database is on the development server, create tarballs of the databases. These packages are used by mirror installs, savvy users, and standalone packages.

todd@brie3> /usr/local/wormbase/util/package/package_databases.pl

This will create tarballs of the current acedb database, the briggsae and elegans GFF databases, and the blast and blat databases, placing them in ~ftp/pub/wormbase/database_tarballs/WSXXX/. Once packaged, two symlinks are adjusted:

   live_version -> [the previous development version]
   development_version -> [the newly installed version]

The "live_version" tarballs will automatically be pushed onto the various satellite servers.

=head3 7. Update the production servers

The Bio::GMOD script gmod_update_installation-wormbase.pl automatically handles updates of the various production servers. This script:

1. Reads the symlink on the development ftp site:
   ftp://dev.wormbase.org/pub/wormbase/mirror/database_tarballs/live_version
2. If the live_version is newer than the current version, databases
   are downloaded and installed.
3. Software is rsync'ed to the checked out production repository on
   brie3: /usr/local/wormbase-production.
4. Servers are restarted as necessary

=head3 8. Analyze logs

Primary logging at WormBase is now handled by squid on fe.wormbase.org. Squid creates pseudo-httpd style logs with a few minor additions. The httpd logs contain only pages that are served directly by the backend servers. These can safely be ignored as all information is contained in the squid logs.

Begin the log analysis by:

todd@brie6> /usr/local/wormbase/util/log_analysis/analyze_logs.sh WSXXX

This script does a number of things, including copying squid logs from fe.wormbase.org to the localhost, concatenating access and error logs to a single file, doing hostname lookups, and generating reports for the current release, year-to-date, and all accesses in total. Reports of the log analysis end up in html/stats/WSXXX. The concatenated logs will be in logs/archive/raw/access_log.WSXXX.

This script can take some time to run. It should probably be placed in the background and removed from shell control:

todd@brie6> disown -h %[job]
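
Put together, a complete backgrounded invocation might look like this (the release number and job number are illustrative):

todd@brie6> /usr/local/wormbase/util/log_analysis/analyze_logs.sh WS100 > analyze_logs.out 2>&1 &
[1] 12345
todd@brie6> disown -h %1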

There is one final step that needs to be done in order to make the release stats available to the wormbase-dev community. Edit the file at html/stats/RELEASES with the appropriate information. The release stats will then appear on the private cgi at

http://www.wormbase.org/db/private/release_stats

Finally, restart apache to create new access_logs

todd@brie6> sudo /usr/local/apache/bin/apachectl restart

=head3 9. Reset the squid cache on fe.wormbase.org

As the final step of the update process, the squid cache on fe.wormbase.org must be reset so that cached pages from the old release are not served inadvertently. Ideally, this should occur *after* the production servers have been updated.
*** IMPORTANT NOTE: Resetting the squid cache requires you to
 *** temporarily stop and then start the squid server.  This means
 *** that the site will briefly be inaccessible (akin to stopping
 *** apache).  If something goes wrong and you cannot restart squid,
 *** the site WILL BE DOWN.  See the "Troubleshooting" section for
 *** details.
 

Reset the squid cache using the initd script:

*** Be sure that you have already done the log analysis!
 *** This command deletes the existing squid logs
 
todd@fe> sudo /etc/rc.d/init.d/squid fullreset

This command will reset the squid cache as well as purge the httpd style logs.

Administration

THIS SECTION HAS NOT YET BEEN WIKIFIED AND MAY BE OUT-OF-DATE

=head2 Managing squid

=head3 Controlling the squid server

Test if the squid server is running:

fe> sudo /etc/rc.d/init.d/squid status

Start squid:

fe> sudo /etc/rc.d/init.d/squid start

Stop squid:

fe> sudo /etc/rc.d/init.d/squid stop

Reset the squid cache. This option will stop squid, reset the swap.state file, and restart squid. The end result is squid started anew with an empty cache.

fe> sudo /etc/rc.d/init.d/squid resetcache

Delete the squid httpd-style access_logs. Like the resetcache option, this takes the server down briefly in order to rotate the squid logs.

fe> sudo /etc/rc.d/init.d/squid deletelogs

Reparse the squid configuration file (not necessary to bring the server down):

fe> sudo /etc/rc.d/init.d/squid reload

=head3 Monitoring squid

The installation of squid offers two options for monitoring the server.

TODO

How to install this stuff

Compile apache with the expires module:

% cd apache_1.3.3
% ./configure --add-module=src/modules/standard/mod_expires.c

Download RRDtool and unpack it.

Put poll.pl, create.sh, 1day-cgi, and htaccess somewhere under your htdocs directory.

Rename htaccess to .htaccess and make sure it is readable by all.

Rename 1day-cgi to 1day.cgi. [sorry, I can't name it .cgi here, or else my server tries to execute it].

Run the create.sh script to create the RRD databases.

Try the poll.pl script by hand:

% perl poll.pl localhost

Add it to your cron jobs:

% crontab
*/5 * * * * /usr/local/apache/share/htdocs/squid-rrd/poll.pl localhost
^D

Tell apache to interpret .cgi as a CGI script by editing srm.conf and uncommenting this line:

AddHandler cgi-script .cgi

Tell apache it's okay to execute CGI scripts, and to set Expires headers, by adding this to access.conf:

<Directory "/usr/local/apache/share/htdocs/squid-rrd">
  AllowOverride Indexes
  Options ExecCGI
</Directory>

Apache needs write access to the directory where RRD creates the PNG files. For now, this is the same directory.

% cd /usr/local/apache/share/htdocs/squid-rrd
% su
# chgrp nobody .
# chmod 775 .

Alternatively, you can create empty PNG files and make just those files writable by nobody:

% cd /usr/local/apache/share/htdocs/squid-rrd
% su
# cp /dev/null connections.day.png ...
# chgrp nobody *.png
# chmod 775 *.png

To use

Request 1day.cgi in your browser. You should see a number of graphs and your browser should refresh the display every 2.5 minutes.

You may want to copy 1day.cgi to another filename, such as "1week.cgi", and change all occurrences of "day" to "week" in the new file.


=head3 Squid tips and tricks

=over 4

=item squid PID

The squid PID file is located at fe:/usr/local/squid/var/logs/squid.pid.

=item squid_start

When squid launches, it first looks for the squid_start script adjacent to the squid binary. This is a useful place for storing additional administration tasks.
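
A minimal sketch of such a script, assuming the squid binary lives in /usr/local/squid/sbin; the tasks shown are only examples:

 #!/bin/sh
 # /usr/local/squid/sbin/squid_start -- runs each time squid is launched
 echo "squid started `date`" >> /usr/local/squid/var/logs/squid_start.log
 # e.g. prime the cache, prune stale swap directories, send a notification, etc.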

=item Squid restarts with write-errors

If log files grow too large and squid can no longer write to them, it will restart.

=item Purging objects from the cache

Occasionally, a corrupt file may be stored in the cache (e.g. if a buggy script produces bad output without generating a server error).

To purge an object (i.e. a URL) from the cache, use the squidclient binary.

*** You must be "on localhost" -- logged in to fe.wormbase.org to use
 *** this command.
 
 /usr/local/squid/bin/squidclient -p [port] -m PURGE [url]
e.g.:
 squidclient -p 80 -h fe.wormbase.org -m PURGE \
   http://www.wormbase.org/db/gene/gene?name=unc-26
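
Note that squid honors PURGE requests only if the configuration allows them; a typical squid.conf stanza for this (an assumption about this installation's configuration) is:

 acl PURGE method PURGE
 http_access allow PURGE localhost
 http_access deny PURGE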

=back

=head2 Log files

The various servers at WormBase create a number of different logs.

=head3 Squid

Squid creates a large number of logs but only two are useful on a day-to-day basis.

=over 4

=item /usr/local/squid/var/logs/cache.log

This is the primary source of information on the health of the squid server. This log is also echoed to the system log /var/log/messages. This log grows very slowly if squid is running well; if squid is sick, watch for a tremendous growth in the size of this log!
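
To keep an eye on it during an update, for example:

 fe> ls -lh /usr/local/squid/var/logs/cache.log
 fe> tail -f /usr/local/squid/var/logs/cache.log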

=item httpd-style access logs

Using the configuration described above, squid creates httpd-style logs at /usr/local/squid/logs/access_log. These logs are akin to the httpd logs found in /usr/local/wormbase/logs/access_log and should be used for the primary log analysis.

*** Note that these logs are very similar to httpd access_logs with
 *** the addition of a single column. This column records the squid
 *** response and hierarchy code, showing how the request was handled.
 

=back

=head3 Acedb

Acedb produces two logs. These are both located in the "database" folder of the current database.

*** Both of these logs MUST be writable by the acedb user! If they
 *** aren't, the database will not be able to start.  This is not
 *** always obvious - look for cycling xinetd requests attempting to
 *** launch the database in /var/log/messages.
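
A quick way to check, and if necessary fix, the permissions; the "acedb" user name and the database path here are assumptions:

 brie6> ls -l /usr/local/acedb/elegans_WSXXX/database/log.wrm \
              /usr/local/acedb/elegans_WSXXX/database/serverlog.wrm
 brie6> sudo chown acedb /usr/local/acedb/elegans_WSXXX/database/log.wrm \
              /usr/local/acedb/elegans_WSXXX/database/serverlog.wrm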
 

=over 4

=item log.wrm

This slowly growing log contains information on the status of the server.

=item serverlog.wrm

This quickly growing log records queries to the database. It is really only useful when trying to debug slow queries.

*** If serverlog.wrm reaches 2 GB in size, acedb may not be able to
 *** write to the file. This causes the database to crash, and xinetd
 *** to fail in launching it! Don't let this happen (see below)!
 

=back

=head3 Apache

Apache creates an error log and access log that can be used for debugging and watching direct requests to back end origin servers.

=over 4

=item /usr/local/wormbase/logs/access_log

A record of all direct requests to the httpd origin server.

*** Since squid intercepts all requests -- and serves many directly
 *** from the cache -- the httpd access_log is not a true indicator of
 *** access statistics.
 

=item /usr/local/wormbase/logs/error_log

Errors encountered during execution of requests at the origin server.

=back

Cron jobs

This section describes cron jobs that are used to keep WormBase running smoothly.

*** Some jobs are specific to certain machines. The relevant
 *** machine is indicated in parentheses where appropriate.
 
*** All cron jobs shown here should be entered in the root crontab
 *** unless otherwise indicated.
 

Log rotation

The following cron entries keep the log files in check.

squid (fe.wormbase.org)

# Rotate the squid httpd-style access logs once per day at 2 AM
 # This cron job is only required on fe.wormbase.org
 0 2 * * * /usr/local/wormbase-admin/log_maintenance/rotation/rotate_squid_logs.pl
 

acedb (unc, vab, crestone)

The Acedb serverlog.wrm grows to epic proportions and rarely contains information that is useful on a day-to-day basis. When the logs grow very large, even log rotation can become painfully slow and may crash the server. Instead, we purge the logs by writing a single bit to the file with the following cron job.

TODO: frequency?
0 * * * * /usr/local/wormbase-admin/log_maintenance/rotation/purge_epic_acedb_logs.pl
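
The effect is essentially truncating the file in place; by hand, that amounts to something like the following (the path is illustrative):

unc> sudo sh -c ': > /usr/local/acedb/elegans_WSXXX/database/serverlog.wrm'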

Synchronization

Maintain software rsync modules (brie3)

Synchronize release notes, MT, and index.html with the rsync target

This cron job ensures that various files created on the production server brie6 are kept in sync with the rsync production code target module. This ensures that all sites that sync with the production target have the same files as those located on brie6. This is, admittedly, a little strange since we sync files from brie6 to brie3, but then also sync the same files back as well.

0 1 * * * /usr/local/wormbase-admin/update_scripts/rsync_misc.pl

Troubleshooting / when things go wrong

=over 4

=item Symptom: hardware crash of fe.wormbase.org

In the event that fe.wormbase.org suffers a catastrophic hardware failure, DNS entries can be modified to point www -> unc.wormbase.org. Once this change percolates through the system, WormBase will be restored.
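
One way to confirm that the DNS change has propagated, assuming the standard dig utility is available:

% dig +short www.wormbase.org
% dig +short unc.wormbase.org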

=item Symptom: periodic success/failures of a single page

Because of the distributed nature of the WormBase infrastructure, troubleshooting problems is more complicated than in a single-server installation. In particular, if a single request succeeds and fails with roughly equal frequency, one of the origin servers may be down.

Check the following things in this order:

=over 4

=item 1. Is Acedb running and responsive on the origin servers?

TODO:
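
No formal procedure is documented yet; a quick sanity check might look like this (the port number is an assumption; use whatever port xinetd is configured to serve):

brie6> ps -C sgifaceserver -o pid=,etime=,args=
brie6> telnet localhost 2005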

=item 2. Is Mysqld running and responsive on the origin servers?

TODO:
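
Also undocumented; a reasonable first check, assuming mysqladmin can connect with the usual credentials:

brie6> mysqladmin ping
brie6> mysqladmin processlist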

=item 3. Is httpd running and responsive on the origin servers?

Check that httpd is running on all back-end origin servers. If it is not, ensure that there is sufficient disk space for logging and restart apache by:

brie6> sudo /usr/local/apache/bin/apachectl start

=item 4. Is squid running and responsive on fe.wormbase.org?

The most unlikely event is that squid itself has crashed on fe.wormbase.org. You can check this by:

fe> sudo /etc/rc.d/init.d/squid status

If squid is down, ensure that there is sufficient disk space and that the log directories are writable by the squid user (if these conditions are not met, squid will exit and send errors to /usr/local/squid/var/logs/cache.log). Restart squid by:

fe> sudo /etc/rc.d/init.d/squid start

=back

=back

Appendices

Appendix 1: Migrating software changes onto the production servers

Previously, WormBase relied on CVS to migrate software changes into the "live" environment on www.wormbase.org. This approach worked well for a single-server environment but presents difficulties when used in a multi-server, load balancing and caching configuration:


Individual CVS checkouts must be done on each server

The software must be updated on each server manually. Although this process can be scripted, CVS conflicts can still result.

CVS checkouts can result in site-breaking CVS conflicts

Each conflict must be dealt with in a tedious and time-consuming manner. Occasionally -- especially during updates with multiple conflicts -- legitimate changes can be inadvertently overwritten.

Expedience of updates vs rigorous testing

The relative ease of updates encourages patching and development on production code instead of more rigorous testing during development. This increases the instability of the site.

Proxy cache may become polluted during the update process

Given the time lag required to update each server, a cache may become polluted with pages loaded from one server running an older version of the database or software.

To resolve these problems, WormBase uses a "push" method to deliver updated software to each production node. This requires updating only a single server which will then push changes onto all production servers.

To migrate changes onto the live servers, first develop and test them either in your own private checked out source or on brie3:/usr/local/wormbase. When testing is complete, commit your changes, and then check them out into brie3:/usr/local/wormbase-production. This directory acts as an rsync module. Once checked out, your changes will automatically migrate to the production nodes.

In practice, check the changes out via CVS into the "production" copy of the source on the development server:

  brie3> cd /usr/local/wormbase-production
  brie3> cvs update // Be sure to resolve and recommit any conflicts!

These changes will automatically migrate to each production node during the next scheduled update (currently every four hours). It may take longer for changes to appear as old pages expire from the caches.

Important note: Third-party software packages currently in use at WormBase still need to be updated on each node individually when necessary.

=head2 Appendix 2: FTP site structure

During the update process, files are mirrored to, and created directly on, the FTP site. The FTP site has the following structure:

drwxrwxr-x  briggsae
lrwxrwxr-x  briggsae-current_release -> briggsae/CB25
drwxrwxr-x  database_tarballs
drwxrwxr-x  elegans
lrwxrwxr-x  elegans-current_release -> elegans/WS121

The briggsae-current_release and elegans-current_release symlinks point to the current releases for both species.

The database_tarballs directory contains tarred and gzipped files of the preconstructed databases for each release. These are created automatically just prior to updating of the live site and are used by individuals installing WormBase from packages.

Each release of the database is stored in its entirety, directly mirrored from the Sanger FTP site:

% ls ~ftp/pub/wormbase/elegans
drwxrwxr-x    4 todd     wormbase      136 Feb 24 06:14 WS119
drwxrwxr-x    4 todd     wormbase      136 Mar 03 12:38 WS120
drwxrwxr-x    4 todd     wormbase      136 Mar 18 08:22 WS121

All symlinks are updated automatically by the update process.

Commonly accessed files can be found at:

elegans-current_release/wormpep
elegans-current_release/wormrna
elegans-current_release/confirmed_genes
elegans-current_release/best_blastp_hits
elegans-current_release/best_blastp_hits_brigpep
elegans-current_release/GENE_DUMPS
elegans-current_release/DNA_DUMPS
elegans-current_release/FEATURE_DUMPS

The wormpep, wormrna, and confirmed_genes symlinks point to oft-used files for the current WSXXX release.

The GENE_DUMPS and DNA_DUMPS directories contain the *.gff.gz and *.dna files, respectively, used during the GFF load process.

=head2 Appendix 3: Squid response and hierarchy codes.

=head3 Response codes

The following codes are appended to the squid-generated, httpd style access logs. The TCP_ codes refer to requests on the HTTP port (usually 3128). The UDP_ codes refer to requests on the ICP port (usually 3130) and do not apply to the current WormBase configuration. These codes (in conjunction with the hierarchy codes listed below) describe how the cache handled the request.

=over 4

=item TCP_HIT

A valid copy of the requested object was in the cache.

=item TCP_MISS

The requested object was not in the cache.

=item TCP_REFRESH_HIT

The requested object was cached but STALE. The IMS query for the object resulted in "304 not modified".

=item TCP_REF_FAIL_HIT

The requested object was cached but STALE. The IMS query failed and the stale object was delivered.

=item TCP_REFRESH_MISS

The requested object was cached but STALE. The IMS query returned the new content.

=item TCP_CLIENT_REFRESH_MISS

The client issued a "no-cache" pragma, or some analogous cache control command along with the request. Thus, the cache has to refetch the object.

=item TCP_IMS_HIT

The client issued an IMS request for an object which was in the cache and fresh.

=item TCP_SWAPFAIL_MISS

The object was believed to be in the cache, but could not be accessed.

=item TCP_NEGATIVE_HIT

Request for a negatively cached object, e.g. "404 not found", which the cache believes to be inaccessible. Also refer to the explanations for negative_ttl in your squid.conf file.

=item TCP_MEM_HIT

A valid copy of the requested object was in the cache and it was in memory, thus avoiding disk accesses.

=item TCP_DENIED

Access was denied for this request.

=item TCP_OFFLINE_HIT

The requested object was retrieved from the cache during offline mode. The offline mode never validates any object, see offline_mode in squid.conf file.

=item UDP_HIT

A valid copy of the requested object was in the cache.

=item UDP_MISS

The requested object is not in this cache.

=item UDP_DENIED

Access was denied for this request.

=item UDP_INVALID

An invalid request was received.

=item UDP_MISS_NOFETCH

During "-Y" startup, or during frequent failures, a cache in hit-only mode will return either UDP_HIT or this code. Neighbours will thus only fetch hits.

=item NONE

Seen with errors and cachemgr requests.

=back

=head3 Hierarchy codes

The following hierarchy codes are used with Squid-2. They convey information about how the request was handled.

=over 4

=item NONE

For TCP HIT, TCP failures, cachemgr requests and all UDP requests, there is no hierarchy information.

=item DIRECT

The object was fetched from the origin server.

=item SIBLING_HIT

The object was fetched from a sibling cache which replied with UDP_HIT.

=item PARENT_HIT

The object was requested from a parent cache which replied with UDP_HIT.

=item DEFAULT_PARENT

No ICP queries were sent. This parent was chosen because it was marked "default" in the config file.

=item SINGLE_PARENT

The object was requested from the only parent appropriate for the given URL.

=item FIRST_UP_PARENT

The object was fetched from the first parent in the list of parents.

=item NO_PARENT_DIRECT

The object was fetched from the origin server, because no parents existed for the given URL.

=item FIRST_PARENT_MISS

The object was fetched from the parent with the fastest (possibly weighted) round trip time.

=item CLOSEST_PARENT_MISS

This parent was chosen because it had the lowest RTT measurement to the origin server. See also the closest-only peer configuration option.

=item CLOSEST_PARENT

The parent selection was based on our own RTT measurements.

=item CLOSEST_DIRECT

Our own RTT measurements returned a shorter time than any parent.

=item NO_DIRECT_FAIL

The object could not be requested directly because of a firewall configuration (see also never_direct and related material), and no parents were available.

=item SOURCE_FASTEST

The origin site was chosen, because the source ping arrived fastest.

=item ROUNDROBIN_PARENT

No ICP replies were received from any parent. The parent was chosen, because it was marked for round robin in the config file and had the lowest usage count.

=item CACHE_DIGEST_HIT

The peer was chosen, because the cache digest predicted a hit. This option was later replaced in order to distinguish between parents and siblings.

=item CD_PARENT_HIT

The parent was chosen, because the cache digest predicted a hit.

=item CD_SIBLING_HIT

The sibling was chosen, because the cache digest predicted a hit.

=item NO_CACHE_DIGEST_DIRECT

This output seems to be unused?

=item CARP

The peer was selected by CARP.

=item ANY_PARENT

part of src/peer_select.c:hier_strings[].

=item INVALID CODE

part of src/peer_select.c:hier_strings[].

=back

Almost any of these may be preceded by 'TIMEOUT_' if the two-second (default) timeout occurs waiting for all ICP replies to arrive from neighbors, see also the icp_query_timeout configuration option.

=head2 Appendix 4: Glossary

=over 4

=item reverse proxy server

A server that intercepts requests for a primary web server and then does interesting things (such as caching or load balancing of responses). Reverse proxy servers are also referred to as surrogate servers. WormBase uses the open-source proxy server Squid.

=item origin server

A webserver such as apache that resides behind a reverse proxy server.

=item HTTP acceleration

The process of caching web pages generated by an origin server on disk or in memory to accelerate future requests for that resource.

=back

=head2 Appendix 5: TODO

Need to adjust how Bio::GMOD fetches the live/development versions. These should be read from the FTP site. The script should also choke if the versions cannot be read.

Place all update scripts in wormbase-admin


Need: script for purging disk cache

Need: monitoring script for squid and ALL servers - should attempt to restart when necessary

RRD stuff

=head1 Author

Author: Todd Harris (harris@cshl.org)
$Id: updating_wormbase.pod,v 1.16 2005/07/07 18:21:15 todd Exp $
Copyright (c) 2004-2005 Cold Spring Harbor Laboratory

=cut