# Hidden Markov Chains and the Analysis of Genome Structure

@article{Churchill1992HiddenMC, title={Hidden Markov Chains and the Analysis of Genome Structure}, author={G. Churchill}, journal={Comput. Chem.}, year={1992}, volume={16}, pages={107-115} }

Abstract: In this paper, statistical methods based on a hidden Markov chain model are used to study the structure of some small complete genomes and a human genome segment. A variety of discrete compositional domains are discovered and their correlations with genome function are explored.
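The idea of compositional domains governed by a hidden Markov chain can be illustrated with a minimal sketch. The two state names, the transition and emission probabilities, and the helper names `viterbi` and `segments` below are all assumptions chosen for illustration; they are not taken from the paper, which estimates its own parameters and may use a different decoding procedure. The sketch decodes the most probable state path with the Viterbi algorithm and collapses it into domains.

```python
import math

# Hypothetical two-domain labelling; the paper's own states may differ.
STATES = ("AT_rich", "GC_rich")

# Illustrative parameters only, not estimates from the paper.
START = {"AT_rich": 0.5, "GC_rich": 0.5}
TRANS = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.1, "GC_rich": 0.9},
}
EMIT = {
    "AT_rich": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
    "GC_rich": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4},
}


def viterbi(seq):
    """Return the most probable hidden-state path for seq (log-space Viterbi)."""
    if not seq:
        return []
    # Initialise with the start distribution and the first emission.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # one dict of back-pointers per position after the first
    for base in seq[1:]:
        prev = score
        score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in s at this position.
            best = max(STATES, key=lambda r: prev[r] + math.log(TRANS[r][s]))
            pointers[s] = best
            score[s] = (prev[best] + math.log(TRANS[best][s])
                        + math.log(EMIT[s][base]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse a state path into (state, start, end) runs, end exclusive."""
    runs = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            runs.append((path[start], start, i))
            start = i
    return runs


if __name__ == "__main__":
    toy = "ATATATATAT" + "GCGCGCGCGC"  # toy input, not real genome data
    print(segments(viterbi(toy)))
```

Working in log space avoids numerical underflow on long sequences, which matters for genome-scale input. In practice the parameters would be estimated from the data, for example by an EM procedure, rather than fixed by hand as here.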

#### 128 Citations

Modelling Bacterial Genomes Using Hidden Markov Models

- Computer Science
- COMPSTAT
- 1998

This work compares different identification algorithms for hidden Markov chains and presents some applications to bacterial genomes to illustrate the method.

Hidden Markov models in biology.

- Medicine
- Methods in molecular biology
- 2010

In the course, the forward-backward, Viterbi, and Baum-Welch (EM) algorithms and a Metropolis sampling scheme are presented.

A comparison of reversible jump MCMC algorithms for DNA sequence segmentation using hidden Markov models

- Mathematics
- 2001

This paper describes a Bayesian approach to determining the number of hidden states in a hidden Markov model (HMM) via reversible jump Markov chain Monte Carlo (MCMC) methods. Acceptance rates for…

A Bayesian approach to DNA sequence segmentation.

- Computer Science, Medicine
- Biometrics
- 2004

A Bayesian method is described that identifies segments by using a Markov chain governed by a hidden Markov model. The method is applied to segmentation of the bacteriophage lambda genome, a common benchmark sequence used for the comparison of statistical segmentation algorithms.

Bayesian Restoration of a Hidden Markov Chain with Applications to DNA Sequencing

- Medicine, Mathematics
- J. Comput. Biol.
- 1999

This work presents a Bayesian solution to the problem of restoring the sequence of states visited by the hidden Markov chain from a given sequence of observed outputs.

Comparing the performance of a reversible jump Markov chain Monte Carlo algorithm for DNA sequences alignment

- Mathematics
- 2006

Assume that K independent copies are made from a common prototype DNA sequence whose length is a random variable. In this paper, the problem of aligning those copies and therefore the problem of…

Markov models of genome segmentation.

- Computer Science, Medicine
- Physical review. E, Statistical, nonlinear, and soft matter physics
- 2007

The advantage of higher-order Markov-model-based segmentation procedures in detecting compositional inhomogeneity is demonstrated on chimeric DNA sequences constructed from genomes of diverse species; in application to the E. coli K12 genome, boundaries of genomic islands, cryptic prophages, and horizontally acquired regions are accurately identified.

Comparative statistical analysis of bacteria genomes in “word” context

- Mathematics, Physics
- 2001

It has been revealed that the word ranked distributions are quite well approximated by a logarithmic law, and the results obtained in the absent-word investigation show the considerably nonrandom character of DNA texts.

Finding Genes in Human DNA with a Hidden Markov Model

- Computer Science, Mathematics
- ISMB 1996
- 1996

The initial results are highly encouraging and indicate that an HMM can form the basis of an effective gene-finding system.

Estimating dependent Binomial mixture models through reversible jump MCMC

- 2015

We present a hidden Markov model of Binomial variables as a dependent mixture model and propose the reversible jump procedure to estimate the number of components and parameters of the model and…

#### References

SHOWING 1-10 OF 24 REFERENCES

Stochastic models for heterogeneous DNA sequences.

- Biology, Medicine
- Bulletin of mathematical biology
- 1989

The DNA sequence is viewed as a stochastic process with local compositional properties determined by the states of a hidden Markov chain, a discrete-state, discrete-outcome version of a general model for non-stationary time series proposed by Kitagawa (1987).

Theoretical models for heterogeneity of base composition in DNA.

- Biology, Medicine
- Journal of theoretical biology
- 1974

It is concluded that the heterogeneity is probably caused by variations in the relative use of synonymous codons in different genes, and a model is described in which the DNA consists of a sequence of “segments” with different underlying base compositions.

Base compositional structure of genomes.

- Biology, Medicine
- Genomics
- 1992

A significant shift in the style of domain models is suggested, in which the variation of A+T content with position is modeled by a random walk with frequent small steps rather than with large quantum jumps, to reduce the amount of computation in the assembly of large sequences from sequences of randomly chosen fragments.

Sequence and organization of the human mitochondrial genome

- Biology, Medicine
- Nature
- 1981

The complete sequence of the 16,569-base pair human mitochondrial genome is presented and shows extreme economy in that the genes have none or only a few noncoding bases between them, and in many cases the termination codons are not coded in the DNA but are created post-transcriptionally by polyadenylation of the mRNAs.

The genome of simian virus 40.

- Biology, Medicine
- Science
- 1978

The nucleotide sequence of SV40 DNA was determined, and the sequence was correlated with known genes of the virus and with the structure of viral messenger RNAs. There is a limited overlap of the…

Statistical characterization of nucleic acid sequence functional domains.

- Biology, Medicine
- Nucleic acids research
- 1983

This report investigated the statistical measures most distinctive of the various domains of the genome and then linked them to current understandings in so far as possible and suggested others.

Giant G+C% mosaic structures of the human genome found by arrangement of GenBank human DNA sequences according to genetic positions.

- Biology, Medicine
- Genomics
- 1990

To determine the overall variation in the G+C% distribution over long ranges of the human genome, DNA sequences of human genes that were closely linked genetically or physically were surveyed from the GenBank Data Bank. Sequences within each group almost always had similar G+C% levels, but those belonging to different groups often had different levels.

Complete sequence of bovine mitochondrial DNA. Conserved features of the mammalian mitochondrial genome.

- Biology, Medicine
- Journal of molecular biology
- 1982

The bovine 12S and 16S ribosomal RNA genes, when compared with those from human mitochondrial DNA, show conserved features that are consistent with proposed secondary structure models for the ribosomal RNAs.

Sequence and gene organization of mouse mitochondrial DNA

- Medicine, Biology
- Cell
- 1981

The mouse mitochondrial DNA genome is highly homologous in overall sequence and in gene organization to human mitochondrial DNA, with the descending order of conserved regions being tRNA genes; origin of light-strand replication; rRNA genes; known protein-coding genes; unidentified protein-coding genes; displacement-loop region.

CpG-rich islands and the function of DNA methylation

- Biology, Medicine
- Nature
- 1986

It is likely that most vertebrate genes are associated with ‘HTF islands’—DNA sequences in which CpG is abundant and non-methylated. Highly tissue-specific genes, though, usually lack islands. The…