Use of a 3D structure data base for understanding sequence-dependent conformational aspects of DNA.
Local variations in B-DNA helix structure are compared among three decamers and eight dodecamers, which contain examples of all ten base-pair step types. All pairwise combinations of helix parameters are compared by linear regression analysis, in a search for internal relationships as well as correlations with base sequence. The primary conclusions are: (1) Three-center hydrogen bonds between base-pairs occur frequently in the major groove at C-C, C-A, A-A and A-C steps, but are less convincing at C-C and C-T steps in the minor groove. The requirements for a large base-pair propeller are (a) that the base-pair be A.T rather than G.C, and (b) that it be involved in a major groove three-center hydrogen bond with the following base-pair. Either condition alone is insufficient. Hence, a large propeller is expected at the leading base-pair of A-A and A-C steps, but not at A-T, T-A, C-A or C-C steps. (2) A systematic and quantitative linkage exists among the helix variables twist, rise, cup and roll, of such strength that the rise between base-pairs can hardly be described as an independent variable at all. Two typical patterns of behavior are observed at steps from one base-pair to the next: high twist profile (HTP), characterized by high twist, low rise, positive cup and negative roll, and low twist profile (LTP), marked by low twist, high rise, negative cup and positive roll. Examples of HTP are steps G-C, G-A and Y-C-A-R, where Y is pyrimidine and R is purine. Examples of LTP steps are C-G, G-G, A-G and C-A steps other than Y-C-A-R. (3) The minor groove is especially narrow across the two base-pairs of the following steps: A-T, T-A, A-A and G-A. (4) In general, base step geometry cannot be correlated solely with the bases that define the step in question; the two flanking steps must also be taken into account. Hence, local helix structure must be studied in the context not of two base-pairs, A-B, but of four, x-A-B-y.
Calladine's rules, although too simple in detail, were correct in defining the length of sequence over which a given perturbation is expressed. Whereas ten different two-base steps are possible, allowing for the identity of complementary sequences, there are 136 different four-base steps. Only 33 of these 136 four-base steps are represented in the decamer and dodecamer structures solved to date, and hence it is premature to try to set up detailed structural algorithms. (5) The sugar-phosphate backbone chains of B-DNA place strong limits on sequence-induced structural variation, damping down most variables within four or five base-pairs, and preventing purine-purine anti-anti mismatches from causing bulges in the double helix. Hence, although short-range sequence-induced deformations (or deformability) are observed, long-range deformations propagated down the helix are not to be expected.
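
The counts of ten two-base steps and 136 four-base steps quoted above follow from treating a sequence and its reverse complement as the same step, since both describe the same stretch of double helix read from opposite strands. The short Python sketch below is an illustration only, not part of the original analysis; the function names `reverse_complement` and `unique_steps` are hypothetical. It enumerates all sequences of a given length and keeps one representative of each complementary pair.

```python
from itertools import product

# Watson-Crick complement of each base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def unique_steps(n):
    """Return one representative for every distinct n-base step.

    A sequence and its reverse complement are counted as the same step;
    the alphabetically smaller of the two is kept as the representative.
    """
    representatives = set()
    for bases in product("ACGT", repeat=n):
        seq = "".join(bases)
        representatives.add(min(seq, reverse_complement(seq)))
    return representatives


if __name__ == "__main__":
    print(len(unique_steps(2)))  # 10 two-base steps
    print(len(unique_steps(4)))  # 136 four-base steps
```

The same totals follow by hand: of the 4^n sequences of length n, 4^(n/2) are self-complementary when n is even, and the remainder pair off, giving (16 - 4)/2 + 4 = 10 for n = 2 and (256 - 16)/2 + 16 = 136 for n = 4.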