BASIC PROTOCOL 9: FINDING SEQUENCE SIMILARITIES WITH BLAST

This protocol describes how to BLAST a sequence against the sequences in WormBase. Further discussion of the BLAST algorithm can be found in UNITS 3.3, 3.4, & 3.11.

Necessary Resources

Hardware

   A standard computer with a reasonably fast connection to the Internet (cable modem, DSL, or Ethernet recommended)

Software

   Web browser such as Internet Explorer (http://www.microsoft.com) or Mozilla (http://www.mozilla.org)

1. Click the Blast/Blat link at the top middle left of the WormBase front page (http://www.wormbase.org/wiki/index.php/Image:Fig_1_8_01.png). This will lead to http://blast.wormbase.org/db/searches/blat. This is the method of choice for searching the C. elegans genome with a non-worm protein or nucleic acid sequence.

For instance, consider the human gene DYMECLIN (DYM), which when mutated leads to Dyggve-Melchior-Clausen or Smith-McCort dysplasia (Cohn et al., 2003; El Ghouzzi et al., 2003). Suppose one would like to use C. elegans as a model system for dissecting DYM's function. Which, if any, C. elegans genes are significantly similar to DYM? Performing BLASTP on WormPep with the DYM protein sequence yields one strong hit, the gene C47D12.2, which is so far uncharacterized.

Fig_1_8_15.png

In addition to the plain orthology between DYM and C47D12.2, a weaker but also significant similarity is visible to two protein products of the hid-1 gene, required for normal muscular activity and insulin signaling (Ailion and Thomas, 2003). BLAST searches can thus reveal not only orthologies but also paralogies that may suggest further clues to gene function.
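For readers who save BLAST results and want to separate strong hits from weaker ones outside the browser, the following is a minimal Python sketch. It assumes a 12-column, tab-separated layout like the one produced by NCBI BLAST+ with -outfmt 6; the WormBase BLAST page may not offer this format. The names Hit, parse_tabular, and significant_hits, the E-value cutoff of 1e-5, and the example rows are all hypothetical and are not real search results.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    subject: str             # identifier of the database sequence that was hit
    percent_identity: float  # percent identical positions in the alignment
    evalue: float            # expected number of chance hits this good
    bitscore: float          # normalized alignment score


def parse_tabular(text: str) -> List[Hit]:
    """Parse 12-column tab-separated BLAST output (assumed layout)."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Assumed columns: 1 = subject, 2 = % identity, 10 = E-value, 11 = bit score
        hits.append(Hit(fields[1], float(fields[2]),
                        float(fields[10]), float(fields[11])))
    return hits


def significant_hits(hits: List[Hit], max_evalue: float = 1e-5) -> List[Hit]:
    """Keep hits at or below the E-value cutoff, most significant first."""
    kept = [h for h in hits if h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: h.evalue)


# Made-up rows for illustration only; not output from a real search.
EXAMPLE = "\n".join([
    "query1\tsubjB\t28.0\t300\t200\t8\t10\t310\t5\t300\t1e-12\t70.1",
    "query1\tsubjA\t45.0\t600\t320\t5\t1\t600\t1\t600\t1e-150\t520.0",
    "query1\tsubjC\t25.0\t80\t58\t2\t100\t180\t40\t120\t0.5\t30.0",
])

if __name__ == "__main__":
    for hit in significant_hits(parse_tabular(EXAMPLE)):
        print(hit.subject, hit.evalue, hit.bitscore)
```

The cutoff is a judgment call: a stricter value keeps only the clearest hits, while a looser one also admits weaker similarities of the kind described above, which may still be biologically meaningful.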

While BLAST is designed to efficiently search genomes for subtle matches to protein-coding sequences, BLAT is aimed at quickly finding strong (95% to 100%) identities of genomic DNA to nucleotide sequences of ≥40 residues (Kent, 2002).
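The identity and length thresholds quoted above can be made concrete with a short Python sketch. This is not the BLAT algorithm itself; it only checks whether an already aligned pair of nucleotide sequences falls in the range the text describes (at least 40 residues, at least 95% identity). The function names and default parameters are hypothetical, and identity is computed here as matching columns divided by alignment length, which is one of several common conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues (gaps never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def in_blat_range(aligned_a: str, aligned_b: str,
                  min_length: int = 40, min_identity: float = 95.0) -> bool:
    """True if the aligned pair is long enough and similar enough."""
    return (len(aligned_a) >= min_length
            and percent_identity(aligned_a, aligned_b) >= min_identity)


if __name__ == "__main__":
    query = "ACGT" * 10                # 40 nucleotides
    one_mismatch = "T" + query[1:]     # 39 of 40 positions identical
    print(percent_identity(query, one_mismatch))
    print(in_blat_range(query, one_mismatch))
```

Pairs that fail this check (shorter or more divergent sequences) are the cases where a BLAST search is likely the better choice.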

2. To examine the exact match between a query sequence and a given BLAST hit, click on its Alignment link in the Details section of the tabulated BLAST output (http://www.wormbase.org/wiki/index.php/Image:Fig_1_8_15.png). This will yield BLAST results in the standard high-scoring segment pair format (Korf et al., 2003).

3. To examine the location of a BLAST hit in the context of the C. elegans genome, click on its Genome View link instead of its Alignment link (http://www.wormbase.org/wiki/index.php/Image:Fig_1_8_15.png). This will give a Genome Browser view with the BLAST hit mapped onto the genome.