The machine learning and data mining communities have become aware that using constraints when learning patterns and rules can be very useful. To this end, a large number of special-purpose systems and techniques have been developed for solving such constraint-based mining and learning problems. These techniques have, so far, been developed independently of the general…
Recently, constraint programming has been proposed as a declarative framework for constraint-based pattern mining. In constraint programming, a problem is modelled in terms of constraints and search is done by a general solver. Similar to most pattern mining algorithms, these solvers typically employ exhaustive depth-first search, where constraints are…
The relationship between constraint-based mining and constraint programming is explored by showing how the typical constraints used in pattern mining can be formulated for use in constraint programming environments. The resulting framework is surprisingly flexible and allows us to combine a wide range of mining constraints in different ways. We implement…
Correlated or discriminative pattern mining is concerned with finding the highest scoring patterns w.r.t. a correlation measure (such as information gain). By reinterpreting correlation measures in ROC space and formulating correlated itemset mining as a constraint programming problem, we obtain new theoretical insights with practical benefits. More…
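As a rough illustration of the kind of correlation measure mentioned above, the following sketch computes the information gain of a pattern from how many positive and negative examples it covers. The function names and the (p, n, P, N) parameterisation are assumptions made for this example, not taken from the paper. Dividing p by P and n by N gives the true and false positive rates, which is one way to place a pattern in ROC space.

```python
from math import log2

def entropy(pos, neg):
    """Binary entropy (in bits) of a group with `pos` positive and `neg` negative examples."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:  # skip empty classes, since 0 * log2(0) is taken as 0
            frac = count / total
            h -= frac * log2(frac)
    return h

def information_gain(p, n, P, N):
    """Information gain of a pattern covering p of P positives and n of N negatives.

    Hypothetical helper: the data is split into covered and uncovered examples,
    and the gain is the drop in class entropy caused by that split.
    """
    total = P + N
    covered = p + n
    uncovered = total - covered
    return (entropy(P, N)
            - (covered / total) * entropy(p, n)
            - (uncovered / total) * entropy(P - p, N - n))

# A pattern covering all positives and no negatives separates the classes perfectly.
perfect = information_gain(4, 0, 4, 4)
# A pattern covering half of each class carries no information about the class.
useless = information_gain(2, 2, 4, 4)
```

With balanced classes, the first pattern reaches the maximum gain of one bit and the second has zero gain.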
The problem of local pattern mining can be formalised as that of finding the set of patterns Th(L, p, D) = {π ∈ L | p(π, D) is true}, that is, the set of all patterns π ∈ L that satisfy a constraint p with respect to a database D. Numerous approaches to pattern mining have been developed to effectively find the patterns adhering to a set of constraints…
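The definition of Th(L, p, D) can be read almost literally as a filter over the pattern language. The sketch below is a naive illustration only, assuming that patterns are itemsets, that the database is a list of transactions, and that the constraint is minimum frequency. The names `theory`, `all_itemsets` and `min_frequency` are hypothetical. Enumerating all of L is exponential in the number of items, so real miners prune the search instead.

```python
from itertools import combinations

def theory(language, predicate, database):
    """Th(L, p, D): every pattern in `language` for which `predicate` holds on `database`."""
    return [pattern for pattern in language if predicate(pattern, database)]

def all_itemsets(items):
    """Enumerate an assumed pattern language L: every non-empty subset of `items`."""
    items = sorted(items)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def min_frequency(threshold):
    """Build a constraint p(pattern, D) that holds when at least
    `threshold` transactions in D contain the pattern."""
    def p(pattern, database):
        return sum(1 for transaction in database if pattern <= transaction) >= threshold
    return p

# Toy database D of four transactions over the items a, b and c.
D = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]
frequent = theory(all_itemsets("abc"), min_frequency(2), D)
```

Here `frequent` holds the itemsets {a}, {b}, {c}, {a, b} and {a, c}. Swapping in a different predicate changes the mining task without touching `theory`, which is the point of the formalisation.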
Computationally retrieving biologically relevant cis-regulatory modules (CRMs) is not straightforward. Because of the large number of candidates and the imperfection of the screening methods, many spurious CRMs are detected that are as high scoring as the biologically true ones. Using ChIP-information allows not only to reduce the regions in which the…
We introduce MiningZinc, a general framework for constraint-based pattern mining, one of the most popular tasks in data mining. MiningZinc consists of two key components: a language component and a toolchain component. The language allows for high-level and natural modeling of mining problems, such that MiningZinc models closely resemble definitions found…
The goal of constraint-based sequence mining is to find sequences of symbols that are included in a large number of input sequences and that satisfy some constraints specified by the user. Many constraints have been proposed in the literature, but a general framework is still missing. We investigate the use of constraint programming as a general framework for…
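To make the notion of a sequence being "included in" input sequences concrete, the sketch below assumes the common reading of inclusion as a subsequence relation in which gaps are allowed. It pairs a minimum-support check with one example user constraint, a maximum pattern length. All function names and the choice of constraint are assumptions made for illustration.

```python
def is_subsequence(pattern, sequence):
    """True if the symbols of `pattern` occur in `sequence` in the same order (gaps allowed)."""
    remaining = iter(sequence)
    # Each `in` test consumes the iterator up to the match, so order is enforced.
    return all(symbol in remaining for symbol in pattern)

def support(pattern, sequences):
    """Number of input sequences that include `pattern`."""
    return sum(1 for sequence in sequences if is_subsequence(pattern, sequence))

def satisfies(pattern, sequences, min_support, max_length):
    """Hypothetical combined check: frequent enough and no longer than `max_length`."""
    return len(pattern) <= max_length and support(pattern, sequences) >= min_support

# Toy input: four sequences over the symbols a, b and c.
sequences = ["abcb", "acb", "bca", "ab"]
ab_support = support("ab", sequences)
cb_support = support("cb", sequences)
```

In this toy input, "ab" is included in three sequences and "cb" in two, while "acb" fails a maximum length of 2 regardless of its support.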
In recent years, a large number of algorithms have been proposed for finding set patterns in boolean data. This includes popular mining tasks based on, for instance, frequent (closed) itemsets. In this chapter, we develop a common framework in which these algorithms can be studied thanks to the principles of constraint programming. We show how such…
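For readers unfamiliar with closed itemsets, the following sketch shows the usual closure check on a small transaction database: an itemset is closed when no item can be added to it without losing a covering transaction. The helper names are hypothetical. Returning the itemset itself when nothing covers it is a simplification chosen for this example.

```python
def cover(itemset, database):
    """Transactions in `database` that contain every item of `itemset`."""
    return [transaction for transaction in database if itemset <= transaction]

def closure(itemset, database):
    """Items shared by all transactions that cover `itemset`."""
    covering = cover(itemset, database)
    if not covering:
        # Simplification for this sketch: an uncovered itemset is returned unchanged.
        return frozenset(itemset)
    return frozenset.intersection(*covering)

def is_closed(itemset, database):
    """An itemset is closed when it equals its own closure."""
    return closure(itemset, database) == frozenset(itemset)

# Toy boolean data: four transactions over the items a, b and c.
D = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]
closure_of_c = closure(frozenset("c"), D)
```

In this database every transaction containing c also contains a, so {c} is not closed and its closure is {a, c}, whereas {a} and {a, b} are closed.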