From WormBaseWiki
Revision as of 20:03, 11 August 2010 by Kyook (talk | contribs)

Background Information

WormBase contains information about the genomic sequence of C. elegans, its genes and their products, and its higher-level traits such as gene expression patterns and neuronal connectivity. These data are interconnected, so that a search beginning with one object (such as a gene) can be directed to related objects of a different type (such as the DNA sequence of the gene, or the cells in which the gene is active). WormBase contains a list of literature pertinent to C. elegans, comprising over 11,000 published articles, along with abstracts containing prepublication data from annual research meetings. This literature can be searched not only with simple word tags but with the Textpresso software package, which extracts complex subsets of information from over 6,400 research papers (Müller et al., 2004). Expression patterns reported for any C. elegans gene in the published literature can be found in WormBase, by reference either to genes or to cell or tissue types. There are results from RNAi screens with over 100 discrete phenotypes, inactivating over 3,100 distinct genes. The genomic sequence has been aligned to orthologous sites in the genome of C. briggsae, a sibling species of C. elegans (Stein et al., 2003). Finally, there are an increasing number of functional annotations for genes, provided both in free text and as Gene Ontology terms (Harris et al., 2004; UNIT 7.2).

WormBase is aimed at Web use, which favors individual over batch queries. However, with some effort, one can also do searches for complex data sets. ACeDB, the engine of WormBase, can be used for large-scale bioinformatics; some searches allow uploading large files for queries. Searches using the ACeDB Query Language (AQL) are also feasible. Information about how to use WormBase is available through e-mail at and through the on-line WormBase Users' Guide. The WormBase developers group actively invites suggestions for improvement from users, and WormBase's source code is freely available for local installation and improvement.

There is an on-line User's Guide maintained by WormBase curators, available at It contains a full index, a set of answers to frequently asked questions, a downloadable PDF version of the Guide (for easy reading in printed form), and explanations of many individual parts of WormBase. It was written independently of this unit, and is recommended as a complementary source of information about WormBase.

At this writing, sequencing and analysis of the C. remanei genome is still underway. However, a preliminary C. remanei genome sequence assembly, with protein-coding gene predictions, is available at From that site, carry out genome sequence searches in the same way as for C. elegans or C. briggsae Genome Browser pages (Basic Protocols 7-8). In the near future, this should move from the development site to the main site, and be syntenically aligned to the C. elegans and C. briggsae genomes.

WormBase data sources

WormBase is derived from the previous C. elegans database ACeDB (Eeckman and Durbin, 1995). ACeDB was designed to archive and correlate classical genetic maps of chromosomes, clone maps (primarily of cosmids, but also of YACs, phage clones, and cDNAs), and sequences of genomic DNA; it also had telegraphic descriptions of some genes, with references and abstracts from C. elegans literature and meetings. ACeDB was crucial in aligning the genetic and physical maps and in allowing collaboration between the C. elegans genome project and its research community. Designed in the early 1990s, ACeDB required local client-server interactions between several personal computers and one server (typically Unix). The rise of the World Wide Web made this unnecessary, while making it desirable to have a central server present the most up-to-date version of ACeDB through a Web interface. This was accomplished in 1998 with AcePerl (Stein and Thierry-Mieg, 1998). AcePerl and ACeDB became the core of WormBase at its founding in 2000 (Chen et al., 2005), although part of WormBase was later moved to MySQL (; UNIT 9.2). More details of WormBase's design are discussed elsewhere (Schwarz et al., 2002).

The main reason for using WormBase is that it is the most extensive C. elegans database publicly available. It is currently designed to make several different kinds of information available. There are deficiencies in WormBase, however, that must be pointed out. Several data sets still remain to be curated by the WormBase staff and included in WormBase. Functional annotations, while currently covering most named genes, need to be continually corrected, revised, and expanded, both in text and in Gene Ontology form. While results for large-scale RNAi (UNIT 12.3) experiments are generally up to date, RNAi results from individual researchers are still being curated. New data sets (e.g., SAGE and whole-proteome interaction maps; Jones et al., 2001; Li et al., 2004) continue to be generated and to require annotation. While it is possible to do a local installation of WormBase, it is not easy; local installations are complex and error-prone. Finally, while it is possible to extract complex patterns from the data in WormBase, and increasingly straightforward tools such as WormMart exist to support this, such extraction is still sometimes clumsy and ad hoc. The curators of WormBase continually strive to diminish such defects, but users should be on guard for them.

Local versus remote access

One issue to be considered in using WormBase is whether to access it remotely or whether to set up a local installation. The authors' experience is that local installations are preferable if they are on reasonably fast hardware (the "optimal" specifications described in Necessary Resources for Alternate Protocol 1), simply because running the program locally reduces lags due to network overload, sharing of a single server by many users, or temporary bugs in WormBase itself. WormBase currently has two official mirror sites, and welcomes any investigators who would like to run public mirrors of their own, since increasing the number of WormBase sites increases reliability and helps debug flaws in the site software.

Alternatives to WormBase

While there is no exact substitute for WormBase, there are alternatives to it. Users who want a C. elegans database that is easy to install locally and that has some of WormBase's general features can try ACeDB, which can be compiled and run on any Linux/Unix operating system. The source code for ACeDB is available, with documentation, at Another option is to use the C. elegans (or C. briggsae) track of the UCSC Genome Browser (Kent et al., 2002;; UNIT 1.4). The WormGenes database ( has convenient links to the gene expression database of Kohara and coworkers (Tabara et al., 1996). Finally, the fee-only database Proteome (Costanzo et al., 2001) is maintained by Incyte (; this database includes C. elegans genes.

Using mirror sites

The two main Web sites for WormBase, and, are maintained by Lincoln Stein at Cold Spring Harbor Laboratory. While generally very stable, they are not always accessible (because of heavy demand or blockage of the network). If the main sites are unavailable, try the mirrors at California Institute of Technology ( or at the Institute of Molecular Biology and Biotechnology (IMBB) in Crete The development site has newer data releases and software versions; these are put up on the development site for three weeks, to check for defects, before being transferred to the main site.

To find what mirrors are available at any given time, check the WormBase front page ( Hypertext links to any official mirrors are given near the bottom of the front page (
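The fallback described above (try the main sites first, then the mirrors) can also be scripted. The following is a minimal Python sketch, not an official WormBase tool. The addresses in `CANDIDATE_SITES` are hypothetical placeholders, since the real ones must be taken from the WormBase front page, and the function names `http_probe` and `first_available` are invented for this illustration.

```python
from typing import Callable, Iterable, Optional
import urllib.error
import urllib.request


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers an HTTP request without error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Unreachable host, timeout, HTTP error status, or malformed URL.
        return False


def first_available(urls: Iterable[str],
                    probe: Callable[[str], bool] = http_probe) -> Optional[str]:
    """Return the first URL for which probe() succeeds, or None if none do."""
    for url in urls:
        if probe(url):
            return url
    return None


# Hypothetical placeholders: replace with the real main-site and mirror
# addresses, listed in order of preference.
CANDIDATE_SITES = [
    "http://main-site.example/",
    "http://mirror-one.example/",
    "http://mirror-two.example/",
]

if __name__ == "__main__":
    site = first_available(CANDIDATE_SITES)
    print(site if site else "No WormBase site is reachable right now.")
```

Passing the probe in as a parameter keeps the ordering logic separate from the network check, so the same function can be exercised without any network access.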

Subscribing to a mailing list

WormBase has E-mail list servers, allowing worldwide participation in the development of WormBase. One of these listservs (wormbase-dev) is a technical list used by official WormBase developers. The others are open to any interested member of the research community. The two listservs most likely to be useful are wormbase-announce and wormbase-help (alias The former is used for general news about WormBase; the latter is an open forum where anyone can ask questions (or make complaints) about WormBase. Any E-mail to wormbase-help is likely to be read by a WormBase curator within a few hours, and may well be read and replied to within minutes.

To subscribe, simply click on the Mailing Lists section of the WormBase front page's site directory. This will lead to the WormBase Mailing Lists page One can then subscribe to any E-mail listservs that are useful by sending an E-mail to In the body of the E-mail (not the subject line) write subscribe [listserv name]. Some of these listservs are restricted to internal WormBase use.
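For readers who prefer to script the subscription request, the message format described above can be sketched with Python's standard `email` package. This only builds the message; it does not send it. The sender and list-server addresses below are hypothetical placeholders (the real list-server address is given on the WormBase Mailing Lists page), and `build_subscribe_message` is a name made up for this example.

```python
from email.message import EmailMessage


def build_subscribe_message(list_name: str, sender: str,
                            list_server_address: str) -> EmailMessage:
    """Build a subscription request whose body is 'subscribe <list name>'."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_server_address
    # The command goes in the body, not the subject line, so no Subject is set.
    msg.set_content(f"subscribe {list_name}")
    return msg


if __name__ == "__main__":
    # Both addresses are placeholders, not real WormBase addresses.
    message = build_subscribe_message(
        "wormbase-announce", "you@example.org", "list-server@example.org")
    print(message)
```

The resulting message could then be handed to any mail client or to `smtplib`, using whatever outgoing mail server is available locally.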

Asking for help, information, or new features

WormBase is not a static resource: it is maintained by a team of bioinformaticians, programmers, and curators who collectively work to explain its features, help make sense of its search results, correct errors, fix defects, add new features on request, and generally try to keep WormBase improving. Investigators can submit feedback from any Web page in WormBase simply by clicking the "Send comments or questions to WormBase" link in the lower left-hand corner ( This leads to the Feedback page ( Type in comments, questions, or complaints, and send them off by clicking the Submit Comments button. Alternatively, one can send e-mail to

Submitting data to WormBase

Click on the Submit link at the top of the WormBase front page ( This will lead to the on-line data submission form at If in doubt, use the basic feedback link, which is given at the top of the data submission page; it leads to misc/feedback. The basic feedback page is always a good choice. Specific areas of interest are given on the Submit Data page to let individual researchers steer their queries in a specific direction if they want to, but doing so is purely optional.

If there is a specific topic that one is particularly interested in, choose an area of interest on the data submission page. There are several areas that one can choose: sequence and gene structure; the worm proteome; cells and anatomy; gene mutations, map locations, and functions; or nomenclature. To direct information to any one area, click on its link (which will read either "Fill out online form" or "Email Individual Person"). E-mails to individuals go directly to a WormBase curator officially responsible for a given topic.

Suggestions for Further Analysis

Using other web sites relevant to C. elegans

The Resources link in the Links section of the WormBase front page ( leads to From there, one can choose one or more links to different C. elegans projects.

These include three on-line reference works on C. elegans biology, and links to several other databases concerned with C. elegans genetics, genomics, or anatomy. Particularly noteworthy is Leon Avery's Web site at the University of Texas ( This site gives convenient, up-to-date access to the abstracts of C. elegans meetings, the Worm Breeder's Gazette, and news announcements by researchers in academia, government, and industry.

Another important resource is WormAtlas ( The WormAtlas database is intended to provide a comprehensive atlas of C. elegans anatomy, like the atlases that have long been available for human anatomy in medicine. It includes a collection of serial electron microscope sections, and several guides to cellular anatomy, including beautiful schematics of neuronal morphology.

Finally, WormBook ( provides an on-line compendium of over 70 articles about C. elegans biology, covering genetics, genomics, molecular biology, cell biology, sex determination, developmental biology, neurobiology, and evolution.

Other model organism databases

Go to From here, one can reach public databases for several eukaryotic species (such as Saccharomyces cerevisiae, Drosophila melanogaster, Arabidopsis thaliana, and Mus musculus). There are also links to the Gene Ontology (UNIT 7.2), Genome KnowledgeBase, and Generic Model Organism Database projects.

Downloading bulk data

The WormBase front page not only allows a basic search, but also has links to several other search pages ( One of these provides a general collection of data for intensive analysis on one's own computer.

Go to the WormBase Downloads page, which in turn gives access to several large data sets. These sets include genomic features in the General Feature Format (GFF;; full genomic and EST sequences in FASTA format; the set of all protein sequences known or predicted to exist in C. elegans (wormpep), and the set of all known non-protein-coding RNA transcripts (wormrna). There are also sequence and gene prediction data for C. briggsae, allowing comparative studies such as phylogenetic footprinting (Nardone et al., 2004). Extensive documentation for how GFF is used in WormBase is available as part of the documentation for the Generic Genome Browser (Stein et al., 2002;, GBrowse, CONFIGURE_HOWTO.pod). Other data sets include: classical genetic maps; movies of mutant embryos in mass RNAi assays; technical details of standard transgenic vectors; translation tables linking classical gene to genomic sequence names; and EST sequences from other nematode species (such as animal or plant parasites) which may include genes lost from the C. elegans genome (such as the proto-oncogene EMSY; Hughes-Davies et al., 2003).
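As an illustration of working with the downloaded GFF files on one's own computer, here is a minimal Python sketch of a reader for the tab-separated GFF layout (sequence name, source, feature type, start, end, score, strand, frame, attributes). It is an assumption-laden sketch rather than WormBase's own tooling: the names `GffFeature` and `parse_gff` are invented here, and the sample rows are made up for demonstration.

```python
from typing import Iterable, Iterator, NamedTuple


class GffFeature(NamedTuple):
    """One row of a GFF file (the nine standard tab-separated columns)."""
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: str
    strand: str
    frame: str
    attributes: str


def parse_gff(lines: Iterable[str]) -> Iterator[GffFeature]:
    """Yield a GffFeature for each data row, skipping comments and blanks."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            raise ValueError(f"expected at least 8 columns: {line!r}")
        # The ninth (attributes) column may be absent in some files.
        attributes = fields[8] if len(fields) > 8 else ""
        yield GffFeature(fields[0], fields[1], fields[2],
                         int(fields[3]), int(fields[4]),
                         fields[5], fields[6], fields[7], attributes)


# Made-up rows for illustration only; they are not real WormBase annotations.
SAMPLE_GFF = (
    "##gff-version 2\n"
    "CHROMOSOME_I\texample\texon\t100\t250\t.\t+\t.\tSequence \"EXAMPLE.1\"\n"
    "CHROMOSOME_I\texample\tintron\t251\t400\t.\t+\t.\tSequence \"EXAMPLE.1\"\n"
)

if __name__ == "__main__":
    for feat in parse_gff(SAMPLE_GFF.splitlines()):
        # GFF coordinates are 1-based and inclusive, hence the +1.
        print(feat.seqname, feat.feature, feat.end - feat.start + 1)
```

For a real download, the same function can be fed an open file handle instead of the sample string, since it only needs an iterable of lines.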

To obtain a specific data set, click on its hypertext link. Data will be given as a Web page (which can be saved to one's hard drive) or, alternatively, one will be prompted to save a file immediately.
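Once a FASTA-format data set (for example, a protein set such as wormpep) has been saved to disk, it can be read with a few lines of code. The following Python sketch assumes only the general FASTA convention of a `>` header line followed by one or more sequence lines; `read_fasta` is a name chosen for this example, and the records shown are made up rather than taken from WormBase.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up records for illustration only.
SAMPLE_FASTA = """>EXAMPLE.1 hypothetical protein
MSTNPKPQRK
TKRNTNRRPQ
>EXAMPLE.2 another hypothetical protein
MKVLAAGIVG
"""

if __name__ == "__main__":
    for name, seq in read_fasta(SAMPLE_FASTA.splitlines()):
        print(name.split()[0], len(seq))
```

Because the reader is a generator, it handles large files one record at a time rather than loading the whole data set into memory.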

Literature Cited

Ailion, M. and Thomas, J.H. 2003. Isolation and characterization of high-temperature-induced dauer formation mutants in Caenorhabditis elegans. Genetics 165:127-144.

Apweiler, R. 1995. Sequence databases. In Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, 3rd ed. (A.D. Baxevanis and B.F.F. Ouellette, eds.) pp. 3-24. John Wiley & Sons, Inc., New York.

Ashrafi, K., Chang, F.Y., Watts, J.L., Fraser, A.G., Kamath, R.S., Ahringer, J., and Ruvkun, G. 2003. Genome-wide RNAi analysis of Caenorhabditis elegans fat regulatory genes. Nature 421:268-272.

Bairoch, A., Apweiler, R., Wu, C.H., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M.J., Natale, D.A., O'Donovan, C., Redaschi, N., and Yeh, L.S. 2005. The Universal Protein Resource (UniProt). Nucleic Acids Res. 33:D154-D159.

Balakrishnan, R., Christie, K.R., Costanzo, M.C., Dolinski, K., Dwight, S.S., Engel, S.R., Fisk, D.G., Hirschman, J.E., Hong, E.L., Nash, R., Oughtred, R., Skrzypek, M., Theesfeld, C.L., Binkley, G., Dong, Q., Lane, C., Sethuraman, A., Weng, S., Botstein, D., and Cherry, J.M. 2005. Fungal BLAST and Model Organism BLASTP Best Hits: new comparison resources at the Saccharomyces Genome Database (SGD). Nucleic Acids Res. 33:D374-D377.

Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths-Jones, S., Khanna, A., Marshall, M., Moxon, S., Sonnhammer, E.L., Studholme, D.J., Yeats, C., and Eddy, S.R. 2004. The Pfam protein families database. Nucleic Acids Res. 32:D138-D141.

Brenner, S. 1974. The genetics of Caenorhabditis elegans. Genetics 77:71-94.

C. elegans Sequencing Consortium. 1998. Genome sequence of the nematode C. elegans: A platform for investigating biology. Science 282:2012-2018.

Chalfie, M. and White, J. 1988. The nervous system. In The Nematode Caenorhabditis elegans (W.B. Wood, ed.) pp. 337-391. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

Chen, N., Harris, T.W., Antoshechkin, I., Bastiani, C., Bieri, T., Blasiar, D., Bradnam, K., Canaran, P., Chan, J., Chen, C.K., Chen, W.J., Cunningham, F., Davis, P., Kenny, E., Kishore, R., Lawson, D., Lee, R., Müller, H.M., Nakamura, C., Pai, S., Ozersky, P., Petcherski, A., Rogers, A., Sabo, A., Schwarz, E.M., Van Auken, K., Wang, Q., Durbin, R., Spieth, J., Sternberg, P.W., and Stein, L.D. 2005. WormBase: a comprehensive data resource for Caenorhabditis biology and genomics. Nucleic Acids Res. 33:D383-D389.

Cho, S., Jin, S.W., Cohen, A., and Ellis, R.E. 2004. A phylogeny of Caenorhabditis reveals frequent loss of introns during nematode evolution. Genome Res. 14:1207-1220.

Cohn, D.H., Ehtesham, N., Krakow, D., Unger, S., Shanske, A., Reinker, K., Powell, B.R., and Rimoin, D.L. 2003. Mental retardation and abnormal skeletal development (Dyggve-Melchior-Clausen dysplasia) due to mutations in a novel, evolutionarily conserved gene. Am. J. Hum. Genet. 72:419-428.

Costanzo, M.C., Crawford, M.E., Hirschman, J.E., Kranz, J.E., Olsen, P., Robertson, L.S., Skrzypek, M.S., Braun, B.R., Hopkins, K.L., Kondu, P., Lengieza, C., Lew-Smith, J.E., Tillberg, M., and Garrels, J.I. 2001. YPD, PombePD and WormPD: Model organism volumes of the BioKnowledge library, an integrated resource for protein information. Nucleic Acids Res. 29:75-79.

Drysdale, R.A., Crosby, M.A., and FlyBase Consortium. 2005. FlyBase: genes and gene models. Nucleic Acids Res. 33:D390-D395.

Eeckman, F.H. and Durbin, R. 1995. ACeDB and Macace. Methods Cell Biol. 48:583-605.

El Ghouzzi, V., Dagoneau, N., Kinning, E., Thauvin-Robinet, C., Chemaitilly, W., Prost-Squarcioni, C., Al-Gazali, L.I., Verloes, A., Le Merrer, M., Munnich, A., Trembath, R.C., and Cormier-Daire, V. 2003. Mutations in a novel gene Dymeclin (FLJ20071) are responsible for Dyggve-Melchior-Clausen syndrome. Hum. Mol. Genet. 12:357-364.

Fire, A., Xu, S., Montgomery, M.K., Kostas, S.A., Driver, S.E., and Mello, C.C. 1998. Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391:806-811.

Ge, H., Walhout, A.J., and Vidal, M. 2003. Integrating 'omic' information: A bridge between genomics and systems biology. Trends Genet. 19:551-560.

GuhaThakurta, D., Schriefer, L.A., Waterston, R.H., and Stormo, G.D. 2004. Novel transcription regulatory elements in Caenorhabditis elegans muscle genes. Genome Res. 14:2457-2468.

Gunsalus, K.C., Ge, H., Schetter, A.J., Goldberg, D.S., Han, J.D., Hao, T., Berriz, G.F., Bertin, N., Huang, J., Chuang, L.S., Li, N., Mani, R., Hyman, A.A., Sonnichsen, B., Echeverri, C.J., Roth, F.P., Vidal, M., and Piano, F. 2005. Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis. Nature 436:861-865.

Harris, M.A., Clark, J., Ireland, A., Lomax, J., Ashburner, M., Foulger, R., Eilbeck, K., Lewis, S., Marshall, B., Mungall, C., Richter, J., Rubin, G.M., Blake, J.A., Bult, C., Dolan, M., Drabkin, H., Eppig, J.T., Hill, D.P., Ni, L., Ringwald, M., Balakrishnan, R., Cherry, J.M., Christie, K.R., Costanzo, M.C., Dwight, S.S., Engel, S., Fisk, D.G., Hirschman, J.E., Hong, E.L., Nash, R.S., Sethuraman, A., Theesfeld, C.L., Botstein, D., Dolinski, K., Feierbach, B., Berardini, T., Mundodi, S., Rhee, S.Y., Apweiler, R., Barrell, D., Camon, E., Dimmer, E., Lee, V., Chisholm, R., Gaudet, P., Kibbe, W., Kishore, R., Schwarz, E.M., Sternberg, P., Gwinn, M., Hannick, L., Wortman, J., Berriman, M., Wood, V., de la Cruz, N., Tonellato, P., Jaiswal, P., Seigfried, T., and White, R. 2004. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 32: D258-D261.

Hubbard, T., Andrews, D., Caccamo, M., Cameron, G., Chen, Y., Clamp, M., Clarke, L., Coates, G., Cox, T., Cunningham, F., Curwen, V., Cutts, T., Down, T., Durbin, R., Fernandez-Suarez, X.M., Gilbert, J., Hammond, M., Herrero, J., Hotz, H., Howe, K., Iyer, V., Jekosch, K., Kahari, A., Kasprzyk, A., Keefe, D., Keenan, S., Kokocinsci, F., London, D., Longden, I., McVicker, G., Melsopp, C., Meidl, P., Potter, S., Proctor, G., Rae, M., Rios, D., Schuster, M., Searle, S., Severin, J., Slater, G., Smedley, D., Smith, J., Spooner, W., Stabenau, A., Stalker, J., Storey, R., Trevanion, S., Ureta-Vidal, A., Vogel, J., White, S., Woodwark, C., and Birney, E. 2005. Ensembl 2005. Nucleic Acids Res. 33:D447-D453.

Hughes-Davies, L., Huntsman, D., Ruas, M., Fuks, F., Bye, J., Chin, S.F., Milner, J., Brown, L.A., Hsu, F., Gilks, B., Nielsen, T., Schulzer, M., Chia, S., Ragaz, J., Cahn, A., Linger, L., Ozdag, H., Cattaneo, E., Jordanova, E.S., Schuuring, E., Yu, D.S., Venkitaraman, A., Ponder, B., Doherty, A., Aparicio, S., Bentley, D., Theillet, C., Ponting, C.P., Caldas, C., and Kouzarides, T. 2003. EMSY links the BRCA2 pathway to sporadic breast and ovarian cancer. Cell 115:523-535.

Jones, S.J., Riddle, D.L., Pouzyrev, A.T., Velculescu, V.E., Hillier, L., Eddy, S.R., Stricklin, S.L., Baillie, D.L., Waterston, R., and Marra, M.A. 2001. Changes in gene expression associated with developmental arrest and longevity in Caenorhabditis elegans. Genome Res. 11:1346-1352.

Jorgensen, E.M. and Mango, S.E. 2002. The art and design of genetic screens: Caenorhabditis elegans. Nat. Rev. Genet. 3:356-369.

Kamath, R.S., Martinez-Campos, M., Zipperlen, P., Fraser, A.G., and Ahringer, J. 2001. Effectiveness of specific RNA-mediated interference through ingested double-stranded RNA in Caenorhabditis elegans. Genome Biol. 2:RESEARCH0002.

Kamath, R.S., Fraser, A.G., Dong, Y., Poulin, G., Durbin, R., Gotta, M., Kanapin, A., Le Bot, N., Moreno, S., Sohrmann, M., Welchman, D.P., Zipperlen, P., and Ahringer, J. 2003. Systematic functional analysis of the Caenorhabditis elegans genome using RNAi. Nature 421:231-237.

Kasprzyk, A., Keefe, D., Smedley, D., London, D., Spooner, W., Melsopp, C., Hammond, M., Rocca-Serra, P., Cox, T., and Birney, E. 2004. EnsMart: a generic system for fast and flexible access to biological data. Genome Res. 14:160-169.

Kent, W.J. 2002. BLAT: The BLAST-like alignment tool. Genome Res. 12:656-664.

Kent, W.J., Sugnet, C.W., Furey, T.S., Roskin, K.M., Pringle, T.H., Zahler, A.M., and Haussler, D. 2002. The human genome browser at UCSC. Genome Res. 12:996-1006.

Kiontke, K., Gavin, N.P., Raynes, Y., Roehrig, C., Piano, F., and Fitch, D.H. 2004. Caenorhabditis phylogeny predicts convergence of hermaphroditism and extensive intron loss. Proc. Natl. Acad. Sci. U.S.A. 101:9003-9008.

Korf, I., Yandell, M., and Bedell, J. 2003. BLAST. O'Reilly & Associates, Inc., Sebastopol, Calif.

Krause, M. 1995. Techniques for analyzing transcription and translation. Methods Cell Biol. 48:513-529.

Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E.L. 2001. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 305:567-580.

Li, S., Armstrong, C.M., Bertin, N., Ge, H., Milstein, S., Boxem, M., Vidalain, P.-O., Han, J.-D.J., Chesneau, A., Hao, T., Goldberg, D.S., Li, N., Martinez, M., Rual, J.-F., Lamesch, P., Xu, L., Tewari, M., Wong, S.L., Zhang, L.V., Berriz, G.F., Jacotot, L., Vaglio, P., Reboul, J., Hirozane-Kishikawa, T., Li, Q., Gabel, H.W., Elewa, A., Baumgartner, B., Rose, D.J., Yu, H., Bosak, S., Sequerra, R., Fraser, A., Mango, S.E., Saxton, W.M., Strome, S., van den Heuvel, S., Piano, F., Vandenhaute, J., Sardet, C., Gerstein, M., Doucette-Stamm, L., Gunsalus, K.C., Harper, J.W., Cusick, M.E., Roth, F.P., Hill, D.E., and Vidal, M. 2004. A map of the interactome network of the metazoan C. elegans. Science 303:540-543.

Lippincott-Schwartz, J. and Patterson, G.H. 2003. Development and use of fluorescent protein markers in living cells. Science 300:87-91.

Lupas, A. 1997. Predicting coiled-coil regions in proteins. Curr. Opin. Struct. Biol. 7:388-393.

Mello, C. and Fire, A. 1995. DNA transformation. Methods Cell Biol. 48:451-482.

Merke, D.P. and Bornstein, S.R. 2005. Congenital adrenal hyperplasia. Lancet 365:2125-2136.

Miller, D.M. and Shakes, D.C. 1995. Immunofluorescence microscopy. Methods Cell Biol. 48:365-394.

Mulder, N.J., Apweiler, R., Attwood, T.K., Bairoch, A., Bateman, A., Binns, D., Bradley, P., Bork, P., Bucher, P., Cerutti, L., Copley, R., Courcelle, E., Das, U., Durbin, R., Fleischmann, W., Gough, J., Haft, D., Harte, N., Hulo, N., Kahn, D., Kanapin, A., Krestyaninova, M., Lonsdale, D., Lopez, R., Letunic, I., Madera, M., Maslen, J., McDowall, J., Mitchell, A., Nikolskaya, A.N., Orchard, S., Pagni, M., Ponting, C.P., Quevillon, E., Selengut, J., Sigrist, C.J., Silventoinen, V., Studholme, D.J., Vaughan, R., and Wu, C.H. 2005. InterPro, progress and status in 2005. Nucleic Acids Res. 33:D201-D205.

Müller, H.M., Kenny, E.E., and Sternberg, P.W. 2004. Textpresso: an ontology-based information retrieval and extraction system for biological literature. PLoS Biol. 2:e309.

Nardone, J., Lee, D.U., Ansel, K.M., and Rao, A. 2004. Bioinformatics for the 'bench biologist': how to find regulatory regions in genomic DNA. Nat. Immunol. 5:768-774.

O'Connell, K.F., Leys, C.M., and White, J.G. 1998. A genetic screen for temperature-sensitive cell-division mutants of Caenorhabditis elegans. Genetics 149:1303-1321.

Pogue, D. 2005. Mac OS X: The Missing Manual, Tiger Edition. O'Reilly & Associates, Inc., Sebastopol, Calif.

Reese, G., Yarger, R.J., and King, T. 2002. Managing and Using MySQL, 2nd ed. O'Reilly & Associates, Inc., Sebastopol, Calif.

Riddle, D.L., Blumenthal, T., Meyer, B.J., and Priess, J.R. (eds.). 1997. C. elegans II. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

Schuler, G.D. 1997. Sequence mapping by electronic PCR. Genome Res. 7:541-550.

Schwartz, R.L., Phoenix, T., and Foy, B.D. 2005. Learning Perl, 4th ed. O'Reilly & Associates, Inc., Sebastopol, Calif.

Schwarz, E.M., Stein, L.D., and Sternberg, P.W. 2002. Caenorhabditis elegans databases. Curr. Genomics 3:111-119.

Sieburth, D., Ch'ng, Q., Dybbs, M., Tavazoie, M., Kennedy, S., Wang, D., Dupuy, D., Rual, J.F., Hill, D.E., Vidal, M., Ruvkun, G., and Kaplan, J.M. 2005. Systematic analysis of genes required for synapse structure and function. Nature 436:510-517.

Simmer, F., Moorman, C., Van Der Linden, A.M., Kuijk, E., Van Den Berghe, P.V., Kamath, R., Fraser, A.G., Ahringer, J., and Plasterk, R.H. 2003. Genome-wide RNAi of C. elegans using the hypersensitive rrf-3 strain reveals novel gene functions. PLoS Biol. 1:E12.

Simpson, P.T., Reis-Filho, J.S., Gale, T., and Lakhani, S.R. 2005. Molecular evolution of breast cancer. J. Pathol. 205:248-254.

Stajich, J.E., Block, D., Boulez, K., Brenner, S.E., Chervitz, S.A., Dagdigian, C., Fuellen, G., Gilbert, J.G., Korf, I., Lapp, H., Lehvaslaiho, H., Matsalla, C., Mungall, C.J., Osborne, B.I., Pocock, M.R., Schattner, P., Senger, M., Stein, L.D., Stupka, E., Wilkinson, M.D., and Birney, E. 2002. The Bioperl toolkit: Perl modules for the life sciences. Genome Res. 12:1611-1618.

Stein, L.D. and Thierry-Mieg, J. 1998. Scriptable access to the Caenorhabditis elegans genome sequence and other ACEDB databases. Genome Res. 8:1308-1315.

Stein, L.D., Mungall, C., Shu, S., Caudy, M., Mangone, M., Day, A., Nickerson, E., Stajich, J.E., Harris, T.W., Arva, A., and Lewis, S. 2002. The Generic Genome Browser: A building block for a model organism system database. Genome Res. 12:1599-1610.

Stein, L.D., Bao, Z., Blasiar, D., Blumenthal, T., Brent, M.R., Chen, N., Chinwalla, A., Clarke, L., Clee, C., Coghlan, A., Coulson, A., D'Eustachio, P., Fitch, D.H., Fulton, L.A., Fulton, R.E., Griffiths-Jones, S., Harris, T.W., Hillier, L.W., Kamath, R., Kuwabara, P.E., Mardis, E.R., Marra, M.A., Miner, T.L., Minx, P., Mullikin, J.C., Plumb, R.W., Rogers, J., Schein, J.E., Sohrmann, M., Spieth, J., Stajich, J.E., Wei, C., Willey, D., Wilson, R.K., Durbin, R., and Waterston, R.H. 2003. The genome sequence of Caenorhabditis briggsae: A platform for comparative genomics. PLoS Biol. 1:E45.

Stone, M., Ockman, S., and DiBona, C. (eds.). 1999. Open Sources: Voices From the Open Source Revolution. O'Reilly & Associates, Inc., Sebastopol, Calif.

Sulston, J.E., Schierenberg, E., White, J.G., and Thomson, J.N. 1983. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev. Biol. 100:64-119.

Swan, K.A., Curtis, D.E., McKusick, K.B., Voinov, A.V., Mapa, F.A., and Cancilla, M.R. 2002. High-throughput gene mapping in Caenorhabditis elegans. Genome Res. 12:1100-1105.

Tabara, H., Motohashi, T., and Kohara, Y. 1996. A multi-well version of in situ hybridization on whole mount embryos of Caenorhabditis elegans. Nucleic Acids Res. 24:2119-2124.

Tisdall, J.D. 2003. Mastering Perl for Bioinformatics. O'Reilly & Associates, Inc., Sebastopol, Calif.

Welsh, M., Dalheimer, M.K., Dawson, T., and Kaufman, L. 2002. Running Linux, 4th ed. O'Reilly & Associates, Inc., Sebastopol, Calif.

White, J.G., Southgate, E., Thomson, J.N., and Brenner, S. 1986. The structure of the nervous system of Caenorhabditis elegans. Philos. Trans. R. Soc. Lond. B Biol. Sci. 314:1-340.

Wicks, S.R., Yeh, R.T., Gish, W.R., Waterston, R.H., and Plasterk, R.H. 2001. Rapid gene mapping in Caenorhabditis elegans using a high density polymorphism map. Nat. Genet. 28:160-164.

Wood, W.B. (ed.). 1988. The nematode Caenorhabditis elegans. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

Wootton, J.C. 1994. Non-globular domains in protein sequences: automated segmentation using complexity measures. Comput. Chem. 18:269-285.