hdbscan: Hierarchical density based clustering

  title={hdbscan: Hierarchical density based clustering},
  author={Leland McInnes and John Healy and S. Astels},
  journal={J. Open Source Softw.},
HDBSCAN: Hierarchical Density-Based Spatial Clustering of Applications with Noise (Campello, Moulavi, and Sander 2013; Campello et al. 2015). [...] The library also includes support for Robust Single Linkage clustering (Chaudhuri et al. 2014; Chaudhuri and Dasgupta 2010), GLOSH outlier detection (Campello et al. 2015), and tools for visualizing and exploring cluster structures. Finally, support for prediction and soft clustering is also available.
HDBSCAN(ε̂): An Alternative Cluster Extraction Method for HDBSCAN
This work proposes an alternative method for selecting clusters from the HDBSCAN hierarchy. It is particularly useful for data sets with variable densities, where users require a low minimum cluster size but want to avoid an abundance of micro-clusters in high-density regions.
HDBSCAN($\hat{\epsilon}$): An Alternative Cluster Extraction Method for HDBSCAN
This work proposes an alternative method for selecting clusters from the HDBSCAN hierarchy that uses an additional input parameter $\hat{\epsilon}$ and acts like a hybrid between DBSCAN* and HDBSCAN.
FISHDBC: Flexible, Incremental, Scalable, Hierarchical Density-Based Clustering for Arbitrary Data and Distance
FISHDBC is a flexible, incremental, scalable, and hierarchical density-based clustering algorithm that approximates HDBSCAN*, an evolution of DBSCAN.
DenMune: Density peak based clustering using mutual nearest neighbors
A novel clustering algorithm that identifies dense regions using mutual nearest neighborhoods of size K, where K is the only parameter required from the user. It obeys the mutual nearest neighbor consistency principle and produces robust results on various low- and high-dimensional datasets relative to several known state-of-the-art clustering algorithms.
Accelerated Hierarchical Density Based Clustering
  • Leland McInnes, John Healy
  • Mathematics, Computer Science
  • 2017 IEEE International Conference on Data Mining Workshops (ICDMW)
  • 2017
The accelerated HDBSCAN* algorithm provides comparable performance to DBSCAN while supporting variable-density clusters and eliminating the need for the difficult-to-tune distance scale parameter epsilon, making it the default choice for density-based clustering.
Chameleon 2
This work proposes an improved graph-based clustering algorithm called Chameleon 2, which overcomes several drawbacks of state-of-the-art clustering approaches. It modifies the internal cluster quality measure and adds an extra step to ensure algorithm robustness.
Condorcet Optimal Clustering with Delaunay Triangulation: Climate Zones and World Happiness Insights
A novel modification to Condorcet clustering methods is proposed, which improves them significantly on both accounts and works particularly well when applied to social network type data sets.
Finding landmarks within settled areas using hierarchical density-based clustering and meta-data from publicly available images
Two novel density-based clustering algorithms are presented for determining relevant landmarks within a certain region: K-DBSCAN, a clustering algorithm based on Gaussian kernels that detects individual inhabited cores within regions; and V-DBSCAN, a hierarchical algorithm suitable for sample spaces with variable density, which is used to discover relevant landmarks in cities or regions.
Clustering tendency assessment for datasets having inter-cluster density variations
  • Dheeraj Kumar, J. Bezdek
  • Computer Science
  • 2020 International Conference on Signal Processing and Communications (SPCOM)
  • 2020
Numerical experiments comparing the proposed novel approach with baseline VAT/iVAT as well as spectral clustering and density-based clustering algorithms establish that LS-VAT and LS-iVAT are superior to the comparable algorithms in terms of clustering quality.
AMTICS: Aligning Micro-clusters to Identify Cluster Structures
AMTICS is developed as a novel and efficient divide-and-conquer approach to pre-cluster data in distributed instances and align the results in a hierarchy afterward.


Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection
An integrated framework for density-based cluster analysis, outlier detection, and data visualization is introduced, consisting of an algorithm to compute hierarchical estimates of the level sets of a density, following Hartigan’s classic model of density-contour clusters and trees.
Density-Based Clustering Based on Hierarchical Density Estimates
This work proposes a theoretically and practically improved density-based, hierarchical clustering method, providing a clustering hierarchy from which a simplified tree of significant clusters can be constructed, and proposes a novel cluster stability measure.
Consistent Procedures for Cluster Tree Estimation and Pruning
A tree pruning procedure is studied that guarantees, under milder conditions than usual, to remove clusters that are spurious while recovering those that are salient, and lower bounds on the sample complexity of cluster tree estimation are derived.
Rates of convergence for the cluster tree
Finite-sample convergence rates for the algorithm and lower bounds on the sample complexity of this estimation problem are given.
hdbscan: Hierarchical density based clustering
HDBSCAN performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon, which allows HDBSCAN to find clusters of varying densities and to be more robust to parameter selection.
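
The dependence on epsilon that this summary describes can be seen in a small sketch. The code below is not the hdbscan library's implementation: `dbscan_star` and `count_clusters` are hypothetical helpers written only for illustration, the data are made-up 1-D points, and the stability-based selection step is left out. It runs a simple DBSCAN*-style clustering at three epsilon values to show that no single value suits both a dense and a sparse group, which is the situation a hierarchy over epsilon is meant to handle.

```python
from itertools import combinations


def dbscan_star(points, eps, min_samples):
    """DBSCAN*-style labels for 1-D points; non-core points are noise (-1)."""
    n = len(points)
    # A point is "core" if at least min_samples points (itself included)
    # lie within distance eps of it.
    core = [sum(abs(p - q) <= eps for q in points) >= min_samples for p in points]

    # Union-find over core points that are within eps of each other.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if core[i] and core[j] and abs(points[i] - points[j]) <= eps:
            parent[find(i)] = find(j)

    # Number the connected components in order of first appearance.
    labels, ids = [], {}
    for i in range(n):
        if not core[i]:
            labels.append(-1)
        else:
            labels.append(ids.setdefault(find(i), len(ids)))
    return labels


def count_clusters(labels):
    """Number of distinct non-noise labels."""
    return len({label for label in labels if label != -1})


# Illustrative data: a dense group near 0, a sparse group near 10,
# and one isolated point.
points = [0.0, 0.1, 0.2, 0.3, 10.0, 11.0, 12.0, 13.0, 30.0]

for eps in (0.15, 1.5, 12.0):
    labels = dbscan_star(points, eps, min_samples=3)
    print(eps, count_clusters(labels), labels)
```

With these points, the smallest epsilon finds only part of the dense group, the middle value separates the two groups, and the largest merges them into one cluster. A method that examines the whole range of epsilon values does not need the user to pick one of these in advance.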