SNP analysis implicates role of cytosine methylation in introducing consequential mutations in Vibrio cholerae genomes
A base mismatch correction process in E. coli K-12 called Very Short Patch (VSP) repair corrects T:G mismatches to C:G when they occur in certain sequence contexts. Two of the substrate mismatches (5'-CTWGG/3'-GGW'CC; W = A or T) occur in the sequence context of cytosine methylation in DNA, and their repair reduces the mutagenic effect of 5-methylcytosine deamination to thymine. However, VSP repair is also known to correct T:G mismatches that are not expected to arise from 5-methylcytosine deamination (for example, 5'-CTAG/3'-GGTC). In these cases, if the original base pair were T:A, VSP repair would cause a T-to-C transition. We have carried out a Markov chain analysis of an E. coli sequence database to determine whether repair at the latter class of sites has altered the abundance of the relevant tetranucleotides. The results are consistent with the prediction that VSP repair would tend to deplete the genome of the T-containing sequences (for example, CTAG) while enriching it for the corresponding C-containing sequences (CCAG). Further, they provide an explanation for the known scarcity of CTAG-containing restriction enzyme sites among the genomes of enteric bacteria and identify VSP repair as a force in shaping the sequence composition of bacterial genomes.
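The kind of Markov chain comparison described above can be illustrated with a minimal sketch. The abstract does not state the order of the Markov model or the software used, so this example assumes the common maximal-order estimate, in which the expected count of a tetranucleotide w1w2w3w4 is N(w1w2w3) x N(w2w3w4) / N(w2w3). The function names `count_kmers`, `markov_expected`, and `obs_exp_ratio` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count overlapping k-mers in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov_expected(seq, word):
    """Expected count of `word` under a maximal-order Markov model.

    Assumption (not stated in the abstract): the expectation is
    N(prefix) * N(suffix) / N(core), where prefix and suffix are the
    two (k-1)-mers of the word and core is the shared (k-2)-mer.
    """
    k = len(word)
    longer = count_kmers(seq, k - 1)
    shorter = count_kmers(seq, k - 2)
    core = shorter[word[1:-1]]
    if core == 0:
        return 0.0
    return longer[word[:-1]] * longer[word[1:]] / core


def obs_exp_ratio(seq, word):
    """Observed/expected ratio; NaN when the expected count is zero."""
    observed = count_kmers(seq, len(word))[word]
    expected = markov_expected(seq, word)
    return observed / expected if expected else float("nan")


if __name__ == "__main__":
    # Toy sequence for demonstration only; a real analysis would read
    # genome or database sequences from a file.
    toy = "CTAGCTAGCCAGCCAGG"
    for word in ("CTAG", "CCAG"):
        print(word, obs_exp_ratio(toy, word))
```

Under this assumed model, a ratio well below 1 for CTAG together with a ratio above 1 for CCAG in a real sequence collection would match the direction of the depletion and enrichment that the abstract predicts. A full analysis would also need to count both strands and assess statistical significance, which this sketch omits.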