Klump H. Codification and evolution of experimentally observed specific recognition sites for restriction enzymes on DNA. Biosystems 1987;21:33-49. [PMID: 2825826 DOI: 10.1016/0303-2647(87)90005-0]
Abstract
The list of published restriction endonucleases along with their substrates provides an excellent data base for the evaluation of the evolution and codification of the key elements for specific recognition sites on the DNA. In this paper the considerations will be limited to palindromic tetramer-, pentamer-, and hexamer-sequences. It is basically assumed that each base pair within these sequences has to be recognized by directionally unique bidentate hydrogen bonds either within the plane of the base pair or by bridging the appropriate H-bond donor/acceptor groups of the neighbouring bases of the same strand. Thus sequence specificity is mediated by twelve (eight) H-bonds, originating from the protein recognition modules. Besides a pronounced preference for GC base pairs expressed by their high frequency in the most abundant sequences, serving the need of maximal thermodynamic stability of the double helical substrates, it can also be shown that the stacking of consecutive bases within the recognition site sequences plays a major role in shaping the particular DNA/protein interface. Finally it will be demonstrated that the full set of sequences discussed in this paper can readily be derived by stepwise expanding the vocabulary of three simple tetrameric sequences by inserting single base pairs into the centre of a minimal sequence, thus creating all the published pentameric restriction sites, or by inserting/adding two GC base pairs in a palindromic way, thus creating the known multiplicity of hexameric sites.
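The derivation scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not name the three seed tetramers, so the seed `GATC` used here is only an illustrative palindromic tetramer, and the function names are hypothetical. The sketch also assumes that "inserting/adding two GC base pairs in a palindromic way" means either placing a `GC` or `CG` dinucleotide at the centre of the tetramer or flanking the tetramer with a G/C pair; the paper may define this differently.

```python
# Illustrative sketch only: seed sequence and helper names are assumptions,
# not taken from the paper.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(seq):
    """A site is palindromic when it equals its own reverse complement."""
    return seq == reverse_complement(seq)


def pentamers_from(tetramer):
    """Insert each single base into the centre of a tetramer.

    The resulting pentamers keep palindromic flanks, but the odd central
    base pair means they are not strictly self-complementary.
    """
    half = len(tetramer) // 2
    return [tetramer[:half] + base + tetramer[half:] for base in "ACGT"]


def hexamers_from(tetramer):
    """Add two G/C base pairs so that the hexamer stays palindromic.

    Assumed reading: insert GC or CG at the centre, or flank the tetramer
    with G...C or C...G.
    """
    half = len(tetramer) // 2
    candidates = set()
    for pair in ("GC", "CG"):
        candidates.add(tetramer[:half] + pair + tetramer[half:])  # central insertion
        candidates.add(pair[0] + tetramer + pair[1])              # flanking addition
    return sorted(seq for seq in candidates if is_palindromic(seq))


seed = "GATC"  # hypothetical seed, chosen only because it is palindromic
print(is_palindromic(seed))
print(pentamers_from(seed))
print(hexamers_from(seed))
```

Under these assumptions, a single tetramer expands into four pentamers and four palindromic hexamers, which is the kind of stepwise vocabulary growth the abstract describes.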