Chapter 5: Multiple Sequence Alignment

One of the most important contributions of biological sequence data to evolutionary analysis is the discovery that the sequences of different organisms are often related. Similar genes are conserved across widely divergent species, often performing a similar or even identical function; in other cases they have mutated or rearranged under natural selection to perform an altered function. Thus, many genes are represented in highly conserved forms in a wide range of organisms. Simultaneous alignment of the sequences of these genes allows the patterns of change in the sequences to be analyzed. Because the potential for learning about the structure and function of molecules by multiple sequence alignment (MSA) is so great, the necessary computational methods have received a great deal of attention. In MSA, sequences are aligned optimally by bringing the greatest number of similar characters into register in the same column of the alignment, just as described in Chapter 3 for the alignment of two sequences.
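To make the idea of bringing similar characters into the same column concrete, the following minimal sketch scores an alignment by counting, in each column, the pairs of rows that hold the same non-gap character. The function names, the toy DNA sequences, and the scoring rule (one point per identical pair, nothing for mismatches or gaps) are assumptions chosen for illustration. Practical methods generally use substitution matrices and gap penalties instead.

```python
from itertools import combinations


def column_identity_score(column):
    """Count pairs of identical, non-gap characters in one alignment column."""
    return sum(1 for a, b in combinations(column, 2) if a == b and a != "-")


def alignment_score(rows):
    """Sum the per-column pair counts over an alignment.

    `rows` is a list of equal-length strings, one per sequence,
    with '-' marking a gap (an assumed convention).
    """
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    # zip(*rows) walks the alignment column by column.
    return sum(column_identity_score(col) for col in zip(*rows))


# Hypothetical toy alignment: the gap in the second row keeps
# the matching characters of all three rows in register.
aligned = ["GATTACA", "GAT-ACA", "GCTTACA"]
score = alignment_score(aligned)
```

Under this toy scheme, moving the gap to the end of the second row (`GATACA-`) pushes several matching characters out of register, and the score drops accordingly.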

As with aligning a pair of sequences, the difficulty of aligning a group of sequences varies considerably, increasing as the degree of sequence similarity decreases. If the amount of sequence variation is minimal, it is quite straightforward to align the sequences, even without the assistance of a computer program. However, if the amount of sequence variation is great, it may be very difficult to find an optimal alignment because so many combinations of substitutions, insertions, and deletions are possible, each predicting a different alignment.


