Imperfect DNA mirror repeats in E. coli TnsA and other protein-coding DNA.


DNA imperfect mirror repeats (DNA-IMRs) are ubiquitous in protein-coding DNA. However, they overlap and often have different centers of symmetry, making it difficult to evaluate their relationship to each other and to specific DNA and protein motifs and structures. This paper describes a systematic method of determining a hierarchy for DNA-IMRs and evaluates their relationship to protein structural elements (PSEs): helices, turns and beta-sheets. Two classes of DNA-IMRs are identified: those terminated by reverse dinucleotides (rd-IMRs) and those terminated by a single (mono) matching nucleotide (m-IMRs). Both rd-IMRs and m-IMRs are evaluated in 17 proteins and illustrated in detail for TnsA. For each of the proteins, Fisher's exact test (FET) is used to measure the coincidence between the terminal dinucleotides of rd-IMRs and the terminal amino acids of individual PSEs. A significant correlation over a span of about 3 nt was found for each protein. The correlation is robust: for most genes, all rd-IMRs ≤ 13 nt can be removed without loss of statistical significance. In TnsA, the protein intervals translated by rd-IMRs > 16 nt contain approximately 88% of the potential functional motifs. The protein translations of the longest rd- and m-IMRs span sequences important to the protein's structure and function. In all 17 proteins studied, the population of rd-IMRs is substantially smaller than the expected number and the population of m-IMRs is greater than the expected number, indicating strong selective pressures. The association of rd-IMRs with PSEs restricts their spatial distribution and, therefore, their number. The greater-than-predicted number of m-IMRs indicates that DNA symmetry exists throughout the entire protein-coding region and may stabilize the sequence.
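To make the two repeat classes concrete, the following is a minimal Python sketch of how candidate rd-IMRs and m-IMRs might be enumerated in a sequence. It is an illustration under stated assumptions, not the paper's procedure: the function names (`mirror_identity`, `find_imrs`), the minimum length, and the use of a mirrored-pair identity threshold to define "imperfect" are all hypothetical choices, since the abstract does not specify them. A mirror repeat is taken here to be a same-strand segment that reads similarly forwards and backwards (not a reverse complement).

```python
def mirror_identity(segment: str) -> float:
    """Fraction of mirrored position pairs that carry the same nucleotide."""
    pairs = len(segment) // 2
    if pairs == 0:
        return 1.0
    matches = sum(segment[k] == segment[-1 - k] for k in range(pairs))
    return matches / pairs


def find_imrs(seq: str, terminal: int = 2, min_len: int = 6,
              min_identity: float = 0.5):
    """Enumerate candidate imperfect mirror repeats by brute force.

    terminal=2 requires reverse dinucleotide ends (XY...YX), as for rd-IMRs;
    terminal=1 requires a single matching end nucleotide, as for m-IMRs.
    min_len and min_identity are assumed thresholds, not values from the paper.
    Returns (start, end, segment) tuples with a half-open [start, end) range.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - min_len + 1):
        for end in range(start + min_len, len(seq) + 1):
            segment = seq[start:end]
            # The first `terminal` bases must mirror the last `terminal` bases.
            if segment[:terminal] != segment[-terminal:][::-1]:
                continue
            if mirror_identity(segment) >= min_identity:
                hits.append((start, end, segment))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from TnsA.
    for start, end, segment in find_imrs("AGCATCGA", terminal=2, min_len=8):
        print(start, end, segment, mirror_identity(segment))
```

The brute-force scan is cubic in sequence length, which is acceptable for a single gene but would need a center-expansion approach for longer inputs. Because overlapping candidates with different centers are all reported, a ranking step such as the hierarchy the paper describes would still be needed afterwards.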

Cite this paper

@article{Lang2005ImperfectDM, title={Imperfect DNA mirror repeats in E. coli TnsA and other protein-coding DNA.}, author={Dorothy Lang}, journal={Bio Systems}, year={2005}, volume={81}, number={3}, pages={183-207} }