Learn More
The relationship between constraint-based mining and constraint programming is explored by showing how the typical constraints used in pattern mining can be formulated for use in constraint programming environments. The resulting framework is surprisingly flexible and allows us to combine a wide range of mining constraints in different ways. We implement(More)
We introduce the problem of k-pattern set mining, concerned with finding a set of k related patterns under constraints. This contrasts to regular pattern mining, where one searches for many individual patterns. The k-pattern set mining problem is a very general problem that can be instantiated to a wide variety of well-known mining tasks including(More)
Machine learning and data mining have become aware that using constraints when learning patterns and rules can be very useful. To this end, a large number of special purpose systems and techniques have been developed for solving such constraint-based mining and learning problems. These techniques have, so far, been developed independently of the general(More)
Recently, constraint programming has been proposed as a declarative framework for constraint-based pattern mining. In constraint programming, a problem is modelled in terms of constraints and search is done by a general solver. Similar to most pattern mining algorithms, these solvers typically employ exhaustive depth-first search, where constraints are used(More)
Correlated or discriminative pattern mining is concerned with finding the highest scoring patterns w.r.t. a correlation measure (such as information gain). By reinterpreting correlation measures in ROC space and formulating correlated itemset mining as a constraint programming problem, we obtain new theoretical insights with practical benefits. More(More)
The goal of constraint-based sequence mining is to find sequences of symbols that are included in a large number of input sequences and that satisfy some constraints specified by the user. Many constraints have been proposed in the literature, but a general framework is still missing. We investigate the use of constraint programming as general framework for(More)
We introduce MiningZinc, a general framework for constraint-based pattern mining, one of the most popular tasks in data mining. MiningZinc consists of two key components: a language component and a toolchain component. The language allows for high-level and natural modeling of mining problems, such that MiningZinc models closely resemble definitions found(More)
Computationally retrieving biologically relevant cis-regulatory modules (CRMs) is not straightforward. Because of the large number of candidates and the imperfection of the screening methods, many spurious CRMs are detected that are as high scoring as the biologically true ones. Using ChIP-information allows not only to reduce the regions in which the(More)