Chapter 9: Gene Prediction and Regulation

With the advent of whole-genome sequencing projects, it has become routine to scan genomic DNA sequences for genes, particularly those that encode proteins. The most likely protein-encoding regions in a new genomic sequence are identified, and the predicted proteins are then subjected to a database similarity search to assess their function, as described in Chapter 6. This procedure is summarized in the gene prediction flowchart (p. 374). The genomic DNA sequence displayed on a Web page is then annotated with sequence positions marked to show the exon–intron structure and location of each predicted gene.
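As a rough illustration of the first step, identifying candidate protein-encoding regions, the following is a minimal sketch of a six-frame open reading frame (ORF) scan. It is not one of the gene prediction programs covered in this chapter. The function names, the `min_codons` threshold, and the use of the standard stop codons TAA, TAG, and TGA are assumptions made for the example. The scan also ignores introns, so it is closer to what might be tried on an intron-free sequence than on a typical eukaryotic gene.

```python
# Hypothetical helper names; a naive ORF scan, not a full gene predictor.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumes the standard genetic code


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=3):
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, start, end, orf) tuples. Coordinates are 0-based
    on the scanned strand, with `end` exclusive. `min_codons` counts
    the start and stop codons and is an arbitrary illustrative cutoff.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG in the current ORF
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGCC"):
        print(orf)
```

In practice, each ORF found this way would be translated and the resulting protein sequence used as the query in the database similarity search. A real analysis would use a much larger length cutoff than the one shown here, since short ORFs occur frequently by chance.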

This chapter first discusses methods for predicting genes that encode proteins and then methods for identifying the regions of the genomic sequence that regulate the activity of those genes. The prediction of genes that specify classes of RNA molecules is discussed in Chapter 8. A more extended review of genome analysis is presented in Chapter 11. A sizable number of computer programs and Web sites are available for gene prediction; some representative ones are listed in Table 9.1 (p. 368). General and specific Web search terms are provided at the end of the chapter.


