- Arno J. Knobbe
- 1999

An important aspect of data mining algorithms and systems is that they should scale well to large databases. A consequence of this is that most data mining tools are based on machine learning algorithms that work on data in attribute-value format. Experience has proven that such 'single-table' mining algorithms indeed scale well. The downside of this format…

- Arno J. Knobbe, Marc de Haas, Arno Siebes
- PKDD
- 2001

The fact that data is scattered over many tables causes many problems in the practice of data mining. To deal with this problem, one either constructs a single table by hand, or one uses a Multi-Relational Data Mining algorithm. In this paper, we propose a different approach in which the single table is constructed automatically using aggregate functions,…
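As a rough illustration of the idea in this abstract, the sketch below flattens a one-to-many relationship into a single table by summarising the child rows with aggregate functions. The tables, column names, the helper `propositionalise`, and the choice of aggregates (count, sum, max, mean) are all hypothetical; the truncated abstract does not say which aggregates or procedure the paper uses.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: one row per customer, many orders per customer.
customers = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
orders = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 2, "amount": 5.0},
]


def propositionalise(parents, children, key, foreign_key, column):
    """Return one row per parent, extended with aggregates of a child column."""
    # Group the child values by the parent they refer to.
    grouped = defaultdict(list)
    for child in children:
        grouped[child[foreign_key]].append(child[column])

    table = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        row = dict(parent)  # copy, so the input rows are not modified
        row[f"{column}_count"] = len(values)
        row[f"{column}_sum"] = sum(values)
        # max and mean are undefined for parents without child rows.
        row[f"{column}_max"] = max(values) if values else None
        row[f"{column}_mean"] = mean(values) if values else None
        table.append(row)
    return table


flat = propositionalise(customers, orders, "id", "customer_id", "amount")
```

Each row of `flat` is in attribute-value format, so it could be passed to a single-table learner.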

- Arno J. Knobbe, Eric K. Y. Ho
- PKDD
- 2006

Pattern discovery algorithms typically produce many interesting patterns. In most cases, patterns are reported based on their individual merits, and little attention is given to the interestingness of a pattern in the context of other patterns reported. In this paper, we propose filtering the returned set of patterns based on a number of quality measures…

- Arno J. Knobbe, Eric K. Y. Ho
- KDD
- 2006

In this paper we present a new approach to mining binary data. We treat each binary feature (item) as a means of distinguishing two sets of examples. Our interest is in selecting from the total set of items an itemset of specified size, such that the database is partitioned with as uniform a distribution over the parts as possible. To achieve this goal, we…
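The selection problem described in this abstract can be sketched as follows. The sketch makes two assumptions that the truncated abstract does not confirm: it measures how uniform the partition is with Shannon entropy over the parts, and it finds the best itemset by brute-force enumeration. The function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def joint_entropy(rows, items):
    """Entropy (in bits) of the partition induced by the chosen items."""
    # Rows with the same values on the chosen items fall into the same part.
    counts = Counter(tuple(row[i] for i in items) for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def most_uniform_itemset(rows, k):
    """Brute-force search for the k items whose partition is most uniform."""
    n_items = len(rows[0])
    return max(
        combinations(range(n_items), k),
        key=lambda items: joint_entropy(rows, items),
    )


# Hypothetical binary database: items 0 and 1 are identical copies,
# so together they split the rows into only two parts.
rows = [
    (0, 0, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 1, 1),
]
best = most_uniform_itemset(rows, 2)
```

On this toy data the redundant pair of items 0 and 1 scores lower than a pair that includes item 2, because it yields two parts rather than four. Enumerating all itemsets of size k is only feasible for small inputs.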

- Wouter Duivesteijn, A. J. Feelders, Arno J. Knobbe
- Data Mining and Knowledge Discovery
- 2008

Finding subsets of a dataset that somehow deviate from the norm, i.e. where something interesting is going on, is a classical Data Mining task. In traditional local pattern mining methods, such deviations are measured in terms of a relatively high occurrence (frequent itemset mining), or an unusual distribution for one designated target attribute (common…

In this paper we present LeGo, a generic framework that utilizes existing local pattern mining techniques for global modeling in a variety of diverse data mining tasks. In the spirit of well-known KDD process models, our work identifies different phases within the data mining step, each of which is formulated in terms of different formal constraints. It…

- Matthijs van Leeuwen, Arno J. Knobbe
- ECML/PKDD
- 2011

Large and complex data is challenging for most existing discovery algorithms, for several reasons. First of all, such data leads to enormous hypothesis spaces, making exhaustive search infeasible. Second, many variants of essentially the same pattern exist, due to (numeric) attributes of high cardinality, correlated attributes, and so on. This causes top-k…

- Matthijs van Leeuwen, Arno J. Knobbe
- Data Mining and Knowledge Discovery
- 2012

Large data is challenging for most existing discovery algorithms, for several reasons. First of all, such data leads to enormous hypothesis spaces, making exhaustive search infeasible. Second, many variants of essentially the same pattern exist, due to (numeric) attributes of high cardinality, correlated attributes, and so on. This causes top-k mining…

- Arno J. Knobbe, Arno Siebes, Bart Marseille
- PKDD
- 2002

The fact that data is scattered over many tables causes many problems in the practice of data mining. To deal with this problem, one either constructs a single table by propositionalisation, or uses a Multi-Relational Data Mining algorithm. In either case, one has to deal with the non-determinacy of one-to-many relationships. In propositionalisation,…

- Wouter Duivesteijn, Arno J. Knobbe
- 2011 IEEE 11th International Conference on Data…
- 2011

Subgroup discovery suffers from the multiple comparisons problem: we search through a large space, hence whenever we report a set of discoveries, this set will generally contain false discoveries. We propose a method to compare subgroups found through subgroup discovery with a statistical model we build for these false discoveries. We determine how much the…