CpG distribution patterns in methylated and non-methylated species.


To characterize the extent of DNA methylation and its possible biological roles in a wide variety of organisms, we analyzed gene sequences extracted from the GenBank database. Sequences from both methylated and non-methylated species were used for comparative analysis. For each species, we examined all complete gene sequences for the local CpG dinucleotide distribution near the 5' ends of genes and for the degree of overall CpG suppression (depletion) across the entire gene region. We show that the distribution patterns of CpG near the 5' region of genes differ among vertebrates, invertebrates, plants and bacteria. CpG island-like peaks in the CpG observed/expected ratio (CpG O/E) were observed not only in methylated species, but also in non-methylated species. In methylated non-vertebrates, overall CpG O/E values were lower, and peaks in the CpG profile of 5' regions were larger, than in non-methylated species. We discuss the implications of such biases with respect to DNA methylation.
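
As an illustration of the CpG O/E measure referred to above, the following is a minimal sketch. The abstract does not state the exact formula used, so the sketch assumes the widely used normalization O/E = (number of CpG × sequence length) / (number of C × number of G). The function names `cpg_oe` and `cpg_oe_profile` and the window and step sizes are hypothetical, chosen only for this example.

```python
from typing import List, Optional, Tuple


def cpg_oe(seq: str) -> Optional[float]:
    """CpG observed/expected ratio for one sequence.

    Assumed formula (not taken from the abstract):
        O/E = (#CpG * N) / (#C * #G)
    Returns None when the sequence has no C or no G, since the
    expected count is then zero and the ratio is undefined.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    # "CG" cannot overlap itself, so str.count gives the exact count.
    cpg = s.count("CG")
    if c == 0 or g == 0:
        return None
    return cpg * n / (c * g)


def cpg_oe_profile(
    seq: str, window: int = 200, step: int = 50
) -> List[Tuple[int, Optional[float]]]:
    """Sliding-window CpG O/E profile along a sequence.

    window and step are illustrative defaults, not values from the study.
    Each entry is (window start position, O/E of that window).
    """
    last_start = max(len(seq) - window, 0)
    return [
        (start, cpg_oe(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]
```

Under this assumed definition, a value near 1 means CpG occurs about as often as base composition predicts, while values well below 1 indicate CpG depletion. Applying `cpg_oe_profile` to the region around a gene's 5' end gives a local profile in which island-like peaks would appear as windows with elevated O/E.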


