Density‐based clustering

@article{Kriegel2010DensitybasedC,
  title={Density‐based clustering},
  author={Hans-Peter Kriegel and Peer Kr{\"o}ger and J{\"o}rg Sander and Arthur Zimek},
  journal={Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery},
  year={2011},
  volume={1},
  pages={231--240}
}
  • H. Kriegel, P. Kröger, J. Sander, A. Zimek
  • Published 1 May 2011
  • Computer Science
  • Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Clustering refers to the task of identifying groups or clusters in a data set. In density‐based clustering, a cluster is a set of data objects spread in the data space over a contiguous region of high density of objects. Density‐based clusters are separated from each other by contiguous regions of low density of objects. Data objects located in low‐density regions are typically considered noise or outliers. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011 1 231–240 DOI: 10… 
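The definition in the abstract can be illustrated with a minimal sketch in the spirit of DBSCAN, which appears in the reference list below. This is not the method of the review article itself. The function name `density_based_clusters` and the parameters `eps` (neighborhood radius) and `min_pts` (minimum neighbor count for a "dense" point) are assumptions chosen for illustration, and the brute-force neighbor search is only suitable for small inputs.

```python
from collections import deque
from math import dist

NOISE = -1  # label for objects in low-density regions


def density_based_clusters(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Illustrative assumption: a point is "dense" (a core point) if at
    least min_pts points, itself included, lie within distance eps.
    """
    n = len(points)
    # Brute-force eps-neighborhoods; each point is its own neighbor.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = [NOISE] * n
    cluster_id = 0
    for start in range(n):
        # Only unlabeled core points can seed a new cluster.
        if not is_core[start] or labels[start] != NOISE:
            continue
        labels[start] = cluster_id
        queue = deque([start])  # holds core points only
        while queue:
            i = queue.popleft()
            for j in neighbors[i]:
                if labels[j] == NOISE:
                    labels[j] = cluster_id
                    # Grow the contiguous dense region through core points;
                    # border points are labeled but not expanded.
                    if is_core[j]:
                        queue.append(j)
        cluster_id += 1
    return labels


# Two dense groups separated by a sparse region, plus one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (5, 5),
]
labels = density_based_clusters(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two groups of four points each form a contiguous high-density region and receive their own cluster id, while the isolated point at (5, 5) has no dense neighborhood and is reported as noise.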

Density‐based clustering

  • J. Sander
  • Computer Science
    Encyclopedia of Machine Learning and Data Mining
  • 2010
TLDR
This review article discusses the statistical notion of density‐based clusters, classic algorithms for deriving a flat partitioning of density‐based clusters, methods for hierarchical density‐based clustering, and methods for semi‐supervised clustering.

Efficient Density-Based Subspace Clustering in High Dimensions

TLDR
This short survey discusses challenges in this area, and presents models and algorithms for efficient and scalable density-based subspace clustering.

Density-Based Clustering Based on Hierarchical Density Estimates

TLDR
This work proposes a theoretically and practically improved density-based, hierarchical clustering method, providing a clustering hierarchy from which a simplified tree of significant clusters can be constructed, and proposes a novel cluster stability measure.

DECWA: Density-Based Clustering using Wasserstein Distance

TLDR
A new characterization of clusters and a new clustering algorithm, based on spatial density and a probabilistic approach, that outperforms other state-of-the-art density-based clustering methods on a wide variety of datasets.

Cluster Analysis of Data with Reduced Dimensionality: An Empirical Study

TLDR
Several clustering algorithms are used to process low-dimensional projections of complex data sets and compared with each other to assess their suitability to process reduced data sets.

Variable Density Based Genetic Clustering

  • Andrei Sorin Sabau
  • Computer Science
    2012 14th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing
  • 2012
TLDR
A parameter-free novel genetic clustering algorithm with an original method for encoding clustering solutions relying on density based clustering parameters, which allows for always valid crossover results, with great offspring variations even when using simple crossover operators.

Neighborhood density information in clustering

  • M. Syed
  • Computer Science
    Ann. Math. Artif. Intell.
  • 2022
TLDR
The novelty of the proposed DBC method can be summed up as follows: a hybrid first-second order optimization algorithm for identifying high-density data points; an adaptive scan radius for identifying reachable points.

A Review on Consensus Clustering Methods

TLDR
This chapter reviews unsupervised consensus learning techniques based on their underlying theoretical principles, presents the exact, approximation, and heuristic approaches, describes the relation of consensus clustering to other well-studied problems, and discusses relevant applications.
...

References

SHOWING 1-10 OF 58 REFERENCES

Density-Connected Subspace Clustering for High-Dimensional Data

TLDR
SUBCLU (density-connected Subspace Clustering), an effective and efficient approach to the subspace clustering problem, based on a formal clustering notion using the concept of density-connectivity underlying the algorithm DBSCAN [EKSX96].

Clustering high dimensional data

  • I. Assent
  • Computer Science
    WIREs Data Mining Knowl. Discov.
  • 2012
TLDR
An overview of the effects of high‐dimensional spaces and their implications for different clustering paradigms is provided, with pointers to the literature and to open research issues.

Direct Clustering of a Data Matrix

TLDR
This article presents a model, and a technique, for clustering cases and variables simultaneously; the principal advantage of this approach is the direct interpretation of the clusters on the data.

Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data

TLDR
A novel clustering technique is presented that addresses problems with varying densities and high dimensionality, while its use of core points handles problems with shape and size; a number of optimizations that allow the algorithm to handle large data sets are also discussed.

Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications

TLDR
GDBSCAN, a generalization of the algorithm DBSCAN, can cluster point objects as well as spatially extended objects according to both their spatial and their nonspatial attributes. Four applications (2D points in astronomy, 3D points in biology, 5D points, and 2D polygons) are presented, demonstrating the applicability of GDBSCAN to real-world problems.

A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise

TLDR
DBSCAN, a new clustering algorithm relying on a density-based notion of clusters and designed to discover clusters of arbitrary shape, is presented; it requires only one input parameter and supports the user in determining an appropriate value for it.

Subspace clustering

TLDR
The problems motivating subspace clustering are sketched, different definitions and usages of subspaces for clustering are described, and exemplary algorithmic solutions are discussed.

Semi-supervised Density-Based Clustering

TLDR
This work describes how labeled objects can be used to help the algorithm detect suitable density parameters for extracting density-based clusters in specific parts of the feature space.

EDSC: efficient density-based subspace clustering

TLDR
This paper proposes lossless efficient detection of density-based subspace clusters by a complete multistep filter-and-refine algorithm and proves that pruning is lossless in both filter steps, guaranteeing completeness of the result.
...