Learn More
We introduce inhomogeneous parsimonious Markov models for modeling statistical patterns in discrete sequences. These models are based on parsimonious context trees, which are a generalization of context trees, and thus generalize variable order Markov models. We follow a Bayesian approach, consisting of structure and parameter learning. Structure learning(More)
Statistical modeling of transcription factor binding sites is one of the classical fields in bioinformatics. The position weight matrix (PWM) model, which assumes statistical independence among all nucleotides in a binding site, used to be the standard model for this task for more than three decades but its simple assumptions are increasingly put into(More)
Inhomogeneous parsimonious Markov models have recently been introduced for modeling symbolic sequences, with a main application being DNA sequence analysis. Structure and parameter learning of these models has been proposed using a Bayesian approach, which entails the practically challenging choice of the prior distribution. Cross validation is a possible(More)
Context trees (CT) are a widely used tool in machine learning for representing context-specific independences in conditional probability distributions. Parsimonious context trees (PCTs) are a recently proposed generalization of CTs that can enable statistically more efficient learning due to a higher structural flexibility, which is particularly useful for(More)
The transcription of genes is often regulated not only by transcription factors binding at single sites per promoter, but by the interplay of multiple copies of one or more transcription factors binding at multiple sites forming a cis-regulatory module. The computational recognition of cis-regulatory modules from ChIP-seq or other high-throughput data is(More)
The binding affinity of DNA-binding proteins such as transcription factors is mainly determined by the base composition of the corresponding binding site on the DNA strand. Most proteins do not bind only a single sequence, but rather a set of sequences, which may be modeled by a sequence motif. Algorithms for de novo motif discovery differ in their promoter(More)
Parsimonious Markov models have been recently developed as a generalization of variable order Markov models. Many practical applications involve a setting with latent variables, with a common example being mixture models. Here, we propose a Bayesian model averaging approach for learning mixtures of parsimonious Markov models that is based on Gibbs sampling.(More)
Summary Recent studies have shown that the traditional position weight matrix model is often insufficient for modeling transcription factor binding sites, as intra-motif dependencies play a significant role for an accurate description of binding motifs. Here, we present the Java application InMoDe, a collection of tools for learning, leveraging and(More)