Cluster Ensembles --- A Knowledge Reuse Framework for Combining Multiple Partitions
TLDR
This paper introduces the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings.
  • 3,769
  • 694
Cluster ensembles: a knowledge reuse framework for combining partitionings
TLDR
We formally define the cluster ensemble problem as an optimization problem and propose three effective and efficient combiners for solving it based on a hypergraph model.
  • 466
  • 104
Impact of Similarity Measures on Web-page Clustering
Clustering of web documents enables (semi-)automated categorization, and facilitates certain types of search. Any clustering method has to embed the documents in a suitable similarity space. While ...
  • 799
  • 47
Relationship-based clustering and cluster ensembles for high-dimensional data mining
This dissertation takes a relationship-based approach to cluster analysis of high (1000 and more) dimensional data that side-steps the ‘curse of dimensionality’ issue by working in a suitable ...
  • 211
  • 28
Relationship-Based Clustering and Visualization for High-Dimensional Data Mining
TLDR
In several real-life data-mining applications, data reside in very high (1000 or more) dimensional space, where both clustering techniques developed for low-dimensional spaces (k-means, BIRCH, CLARANS, CURE, DBScan, etc.) and visualization methods such as parallel coordinates or projective visualizations are rendered ineffective.
  • 137
  • 7
Value-based customer grouping from large retail data sets
TLDR
We propose OPOSSUM, a novel similarity-based clustering algorithm using constrained, weighted graph-partitioning.
  • 78
  • 3
A Scalable Approach to Balanced, High-Dimensional Clustering of Market-Baskets
TLDR
This paper presents Opossum, a novel similarity-based clustering approach based on constrained, weighted graph-partitioning.
  • 79
  • 3
Distance based clustering of association rules
Association rule mining is one of the most important procedures in data mining. In industry applications, often more than 10,000 rules are discovered. To allow manual inspection and support ...
  • 67
  • 2
Detecting moving objects in airborne forward looking infra-red sequences
In this paper we propose a system that detects independently moving objects (IMOs) in forward looking infra-red (FLIR) image sequences taken from an airborne, moving platform. Ego-motion effects are ...
  • 58
  • 1
Similarity-Based Text Clustering: A Comparative Study
TLDR
We compare popular similarity measures (Euclidean, cosine, Pearson correlation, extended Jaccard) in conjunction with several clustering techniques (random, self-organizing feature map, hypergraph partitioning, generalized k-means, weighted graph partitioning) on a variety of high-dimensional sparse vector data sets representing text documents.
  • 26
  • 1