• Corpus ID: 218684722

# Stable and consistent density-based clustering

@article{Rolle2020StableAC,
title={Stable and consistent density-based clustering},
author={Alexander Rolle and Luis Scoccola},
journal={ArXiv},
year={2020},
volume={abs/2005.09048}
}
• Published 18 May 2020
• Computer Science, Mathematics
• ArXiv
We present a consistent approach to density-based clustering, which satisfies a stability theorem that holds without any distributional assumptions. We also show that the algorithm can be combined with standard procedures to extract a flat clustering from a hierarchical clustering, and that the resulting flat clustering algorithms satisfy stability theorems. The algorithms and proofs are inspired by topological data analysis.
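
As a rough illustration of what "extracting a flat clustering from a hierarchical clustering" can mean, the sketch below cuts a plain single-linkage hierarchy at one distance threshold. This is not the algorithm of the paper: single linkage is only a familiar stand-in, and the names `flat_clusters`, `points`, and `threshold` are hypothetical choices made for this example.

```python
from itertools import combinations
from math import dist


def flat_clusters(points, threshold):
    """Cut a single-linkage hierarchy at `threshold` (illustrative only).

    Two points share a cluster when a chain of points joins them and
    every consecutive pair in the chain is at distance <= threshold.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every pair that is close enough.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    # Group point indices by the root of their component.
    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    # Sort so the result does not depend on merge order.
    return sorted(sorted(c) for c in clusters.values())


points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 0.0), (5.4, 0.0)]
fine = flat_clusters(points, threshold=0.6)     # two groups of nearby points
coarse = flat_clusters(points, threshold=10.0)  # everything merges
```

Raising the threshold only ever merges clusters, so the family of flat clusterings over all thresholds forms the hierarchy; choosing one threshold is the simplest way to flatten it.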


### Flattening Multiparameter Hierarchical Clustering Functors

This work brings together topological data analysis, applied category theory, and machine learning to study multiparameter hierarchical clustering and introduces a Bayesian update algorithm for learning clustering parameters from data.

### Stability of 2-Parameter Persistent Homology

• Mathematics
ArXiv
• 2020
It is shown that several related density-sensitive constructions of bifiltrations from data satisfy stability properties accommodating the addition and removal of outliers, and 1-Lipschitz stability results closely analogous to the standard stability results for 1-parameter persistent homology.

### Stability for layer points

The theory of layer points is generalized to the context of ~v-hierarchical clusterings to consider cases where a hierarchical clustering of a finite metric space, Y, is interleaved with a hierarchical clustering of some sample X ⊆ Y.

### Locally Persistent Categories And Metric Properties Of Interleaving Distances

This thesis presents a uniform treatment of different distances used in the applied topology literature. We introduce the notion of a locally persistent category, which is a category with a notion of

### An Introduction to Multiparameter Persistence

• Mathematics
ArXiv
• 2022
In topological data analysis (TDA), one often studies the shape of data by constructing a filtered topological space, whose structure is then examined using persistent homology. However, a single

### $\ell^p$-Distances on Multiparameter Persistence Modules

• Mathematics
ArXiv
• 2021
It is shown that on 1- or 2-parameter persistence modules over prime fields, $d_I^p$ is the universal metric satisfying a natural stability property; this result extends a stability result of Skraba and Turner for the p-Wasserstein distance on barcodes in the 1-parameter case, and is also a close analogue of a universality property for the interleaving distance given by the second author.

### Characterization of Gromov-type geodesics

• Mathematics
• 2021
Classical results due to Gromov and to Petersen establish that, when endowed with the Gromov-Hausdorff distance $d_{GH}$, the collection M of all isometry classes of compact metric spaces is a complete

### Interleaving by Parts: Join Decompositions of Interleavings and Join-Assemblage of Geodesics

• Mathematics
• 2019
Metrics of interest in topological data analysis (TDA) are often explicitly or implicitly in the form of an interleaving distance $d_I$ between poset maps (i.e. order-preserving maps), e.g. the

### Rectification of interleavings and a persistent Whitehead theorem

• Mathematics
• 2020
The homotopy interleaving distance, a distance between persistent spaces, was introduced by Blumberg and Lesnick and shown to be universal, in the sense that it is the largest homotopy-invariant

## References

### Density-Based Clustering Based on Hierarchical Density Estimates

• Computer Science
PAKDD
• 2013
This work proposes a theoretically and practically improved density-based, hierarchical clustering method, providing a clustering hierarchy from which a simplified tree of significant clusters can be constructed, and proposes a novel cluster stability measure.

### Generalized density clustering

• Computer Science
• 2010
We study generalized density-based clustering in which sharply defined clusters such as clusters on lower-dimensional manifolds are allowed. We show that accurate clustering is possible even in high

### Characterization, Stability and Convergence of Hierarchical Clustering Methods

• Mathematics
J. Mach. Learn. Res.
• 2010
It is shown that within this framework, one can prove a theorem analogous to one of Kleinberg (2002), in which one obtains an existence and uniqueness theorem instead of a non-existence result.

### Accelerated Hierarchical Density Based Clustering

• Computer Science, Physics
2017 IEEE International Conference on Data Mining Workshops (ICDMW)
• 2017
The accelerated HDBSCAN* algorithm provides comparable performance to DBSCAN, while supporting variable-density clusters and eliminating the need for the difficult-to-tune distance scale parameter epsilon, making it the default choice for density-based clustering.

### Rates of convergence for the cluster tree

• Computer Science, Mathematics
NIPS
• 2010
Finite-sample convergence rates for the algorithm and lower bounds on the sample complexity of this estimation problem are given.

### Persistence-Based Clustering in Riemannian Manifolds

• Computer Science
JACM
• 2013
A clustering scheme that combines a mode-seeking phase with a cluster merging phase in the corresponding density map, and whose output clusters have the property that their spatial locations are bound to the ones of the basins of attraction of the peaks of the density.

### Multiparameter Hierarchical Clustering Methods

• Computer Science, Mathematics
• 2010
This work proposes an extension of hierarchical clustering methods, called multiparameter hierarchical clustering methods, which are designed to exhibit sensitivity to density while retaining desirable theoretical properties, and presents both a characterization and a stability theorem.

### Consistency of Single Linkage for High-Density Clusters

High-density clusters are defined on a population with density f in r dimensions to be the maximal connected sets of form {x | f(x) ≥ c}. Single-linkage clustering is evaluated for

### A Generalized Single Linkage Method for Estimating the Cluster Tree of a Density

• Computer Science
• 2010
A graph-based method is presented that can approximate the cluster tree of any density estimate and proposes excess mass as a measure for the size of a branch, reflecting the height of the corresponding peak of the density above the surrounding valley floor as well as its spatial extent.

### Beyond Hartigan Consistency: Merge Distortion Metric for Hierarchical Clustering

• Computer Science
COLT
• 2015
Two limit properties, separation and minimality, are identified, which address both over-segmentation and improper nesting and together imply (but are not implied by) Hartigan consistency.