# DBSCAN: Optimal Rates For Density Based Clustering

@article{Wang2017DBSCANOR, title={DBSCAN: Optimal Rates For Density Based Clustering}, author={Daren Wang and Xin Yang Lu and Alessandro Rinaldo}, journal={arXiv: Statistics Theory}, year={2017} }

We study the problem of optimal estimation of the density cluster tree under various assumptions on the underlying density. Building up from the seminal work of Chaudhuri et al. [2014], we formulate a new notion of clustering consistency which is better suited to smooth densities, and derive minimax rates of consistency for cluster tree estimation for Hölder smooth densities of arbitrary degree $\alpha$. We present a computationally efficient, rate optimal cluster tree estimator based on a…
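The abstract is truncated before it describes the estimator, so the following is only a generic sketch of DBSCAN-style clustering at a single density level, not the authors' procedure. It assumes Euclidean distance, a brute-force neighbour search, and the usual "core point" rule; the function name `dbscan_level_clusters`, the parameters `eps` and `min_pts`, and the sample data are all hypothetical.

```python
from math import dist


def dbscan_level_clusters(points, eps, min_pts):
    """Cluster core points at one density level, DBSCAN-style.

    A point is "core" if at least `min_pts` sample points (itself included)
    lie within distance `eps`. Core points within `eps` of each other are
    linked, and each connected component is reported as one cluster.
    """
    n = len(points)
    # Neighbour lists by brute force: O(n^2), fine for a small illustration.
    neighbours = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbours[i]) >= min_pts}

    clusters, seen = [], set()
    for start in sorted(core):
        if start in seen:
            continue
        # Depth-first search restricted to core points.
        component, stack = [], [start]
        seen.add(start)
        while stack:
            i = stack.pop()
            component.append(i)
            for j in neighbours[i]:
                if j in core and j not in seen:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(component))
    return clusters


# Two well-separated groups plus one isolated point (hypothetical data).
sample = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (10.0, 0.0)]
clusters = dbscan_level_clusters(sample, eps=0.5, min_pts=3)
print(clusters)
```

In this sketch the isolated point never becomes a core point, so it is left out of every cluster. Re-running with a larger `min_pts` at fixed `eps` loosely corresponds to asking for clusters at a higher density level.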

## 2 Citations

Change-point Detection for Sparse and Dense Functional Data in General Dimensions

- Computer Science
- 2022

The consistency of FSBS for multiple change-point estimation is shown and a sharp localisation error rate is provided, which reveals an interesting phase transition phenomenon depending on the number of functional curves observed and the sampling frequency for each curve.

Faster DBSCAN via subsampled similarity queries

- Computer Science
- NeurIPS
- 2020

An extensive experimental analysis is provided showing that on large datasets, one can subsample as little as $0.1\%$ of the neighborhood graph, yielding speedups of over 200x and a 250x reduction in RAM consumption compared to scikit-learn's implementation of DBSCAN, while still maintaining competitive clustering performance.

## References

Showing 1-10 of 29 references

Adaptive Density Level Set Clustering

- Computer Science, Mathematics
- COLT
- 2011

This paper presents a simple algorithm that is able to asymptotically determine the optimal level, that is, the level at which there is the first split in the cluster tree of the data generating distribution.

Consistent Procedures for Cluster Tree Estimation and Pruning

- Computer Science
- IEEE Transactions on Information Theory
- 2014

A tree pruning procedure is studied that is guaranteed, under milder conditions than usual, to remove spurious clusters while recovering salient ones, and lower bounds on the sample complexity of cluster tree estimation are derived.

Set estimation: Another bridge between statistics and geometry.

- Mathematics
- 2009

A nonexhaustive expository overview of set estimation theory is given, which presents the basic ideas, some typical tools involved in the theory and a few applications.

$U$-Processes: Rates of Convergence

- Mathematics
- 1987

A new stochastic process is introduced: a collection of $U$-statistics indexed by a family of symmetric kernels. Conditions are obtained for uniform almost sure convergence…

On boundary estimation

- Mathematics
- Advances in Applied Probability
- 2004

We consider the problem of estimating the boundary of a compact set $S \subset \mathbb{R}^d$ from a random sample of points taken from $S$. We use the Devroye-Wise estimator which is a union of balls centred at the…
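As a rough illustration of a union-of-balls set estimate, the sketch below tests whether a query point falls inside it. It assumes closed Euclidean balls of a common radius centred at the sample points, which is the usual form of the Devroye-Wise estimator; the summary above is truncated before it says so. The function name, the radius, and the data are hypothetical.

```python
from math import dist


def in_union_of_balls(x, sample, radius):
    """Return True if x lies in the union of closed balls B(X_i, radius)."""
    return any(dist(x, xi) <= radius for xi in sample)


# Hypothetical sample drawn from the unit square.
sample = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.2), (0.3, 0.8)]
inside = in_union_of_balls((0.45, 0.55), sample, radius=0.1)
outside = in_union_of_balls((2.0, 2.0), sample, radius=0.1)
print(inside, outside)
```

Membership reduces to a nearest-neighbour distance check, so the estimate never has to be constructed explicitly.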

A plug-in approach to support estimation

- Mathematics
- 1997

We suggest a new approach, based on the use of density estimators, for the problem of estimating the (compact) support of a multivariate density. This subject (motivated in terms of pattern analysis…

Measuring Mass Concentrations and Estimating Density Contour Clusters-An Excess Mass Approach

- Mathematics
- 1995

By using empirical process theory, the so-called excess mass approach is studied. It can be applied to various statistical problems, especially in higher dimensions, such as testing for…

Single linkage clustering and continuum percolation

- Mathematics
- 1995

Suppose $f$ is a probability density function in $d$ dimensions, $d \geq 2$. A single linkage $a$-cluster on a sample of size $n$ from the density $f$ is a connected component of the union of balls of volume $a$,…
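The definition above can be sketched in a few lines, assuming Euclidean balls centred at the sample points (the summary is truncated before it states the centres). Two balls of equal radius $r$ intersect exactly when their centres are at most $2r$ apart, so the components can be found with a union-find structure over the sample. The function names and the data are hypothetical.

```python
from math import dist, gamma, pi


def ball_radius(volume, d):
    """Radius of a d-dimensional Euclidean ball with the given volume."""
    unit_ball_volume = pi ** (d / 2) / gamma(d / 2 + 1)
    return (volume / unit_ball_volume) ** (1 / d)


def single_linkage_clusters(points, a):
    """Connected components of the union of balls of volume `a`."""
    d = len(points[0])
    r = ball_radius(a, d)
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Two balls of radius r intersect exactly when centres are within 2r.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= 2 * r:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical planar sample; volume chosen so that each ball has radius 0.6.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
clusters = single_linkage_clusters(points, a=pi * 0.6 ** 2)
print(clusters)
```

The pairwise loop is quadratic in the sample size, which is acceptable for illustration but not for large samples.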

Minimax theory of image reconstruction

- Mathematics
- 1993

Image processing is an increasingly important area of research and there exists a large variety of image reconstruction methods proposed by different authors. This book is concerned with a technique…

Smoothing of Multivariate Data: Density Estimation and Visualization

- Mathematics
- 2009

Smoothing of Multivariate Data is an excellent book for courses in multivariate analysis, data analysis, and nonparametric statistics at the upper-undergraduate and graduate levels and serves as a valuable reference for practitioners and researchers in the fields of statistics, computer science, economics, and engineering.