Corpus ID: 233476172

Flattening Multiparameter Hierarchical Clustering Functors

Dan Shiebler
We bring together topological data analysis, applied category theory, and machine learning to study multiparameter hierarchical clustering. We begin by introducing a procedure for flattening multiparameter hierarchical clusterings. We demonstrate that this procedure is a functor from a category of multiparameter hierarchical partitions to a category of binary integer programs. We also include empirical results demonstrating its effectiveness. Next, we introduce a Bayesian update algorithm for… 
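The paper's flattening procedure maps multiparameter hierarchical clusterings to binary integer programs. As a much simpler point of comparison, a one-parameter hierarchical clustering can be flattened by cutting the dendrogram at a single scale. A minimal sketch with SciPy; the dataset and threshold are illustrative choices, not from the paper:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy dataset: two well-separated groups in the plane.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# A hierarchical (single-linkage) clustering assigns a partition to every
# scale t; "flattening" means selecting one partition from the hierarchy.
Z = linkage(points, method="single")

# Cut the dendrogram at scale t = 1.0 to obtain a flat partition.
flat = fcluster(Z, t=1.0, criterion="distance")
print(flat)  # two clusters, e.g. [1 1 1 2 2 2]
```

The paper's setting is harder because a multiparameter hierarchy has no single scale axis to cut along, which is what motivates the optimization-based (binary integer program) formulation.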


Category Theory in Machine Learning

This work aims to document the motivations, goals and common themes across these applications of category theory in machine learning, touching on gradient-based learning, probability, and equivariant learning.



Consistency constraints for overlapping data clustering

This work examines overlapping clustering schemes with functorial constraints, in the spirit of Carlsson–Mémoli, and shows that any clustering functor is naturally constrained to refine single-linkage clusters and to be refined by maximal-linkage clusters.
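The refinement constraint can be checked numerically on a toy example: at a fixed scale t, complete-linkage ("maximal-linkage") clusters should refine the single-linkage clusters at the same scale. A hedged sketch with SciPy; the chain dataset is an illustrative choice, not from the paper:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# A "chain" of points on the line with unit spacing.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
t = 1.0

single = fcluster(linkage(X, method="single"), t=t, criterion="distance")
maximal = fcluster(linkage(X, method="complete"), t=t, criterion="distance")

# Refinement check: any two points sharing a maximal-linkage cluster
# must also share a single-linkage cluster at the same scale.
for i in range(len(X)):
    for j in range(len(X)):
        if maximal[i] == maximal[j]:
            assert single[i] == single[j]

print(len(set(single)), len(set(maximal)))  # 1 3 (single-linkage is coarser)
```

Single linkage joins the whole chain at scale 1, while complete linkage only merges clusters whose full diameter stays below the scale, so it produces a strictly finer partition here.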

Persistent Clustering and a Theorem of J. Kleinberg

This work constructs a framework for studying clustering algorithms built on two key ideas, persistence and functoriality, and shows that within this framework one can prove a theorem analogous to one of J. Kleinberg, obtaining an existence and uniqueness theorem instead of a non-existence result.

Stable and consistent density-based clustering

We present a consistent approach to density-based clustering, which satisfies a stability theorem that holds without any distributional assumptions. We also show that the algorithm can be combined…

Persistence-Based Clustering in Riemannian Manifolds

This paper presents a clustering scheme that combines a mode-seeking phase with a cluster-merging phase in the corresponding density map; the spatial locations of the output clusters are tied to the basins of attraction of the peaks of the density.

Rates of convergence for the cluster tree

Finite-sample convergence rates for the algorithm and lower bounds on the sample complexity of this estimation problem are given.

Objective Criteria for the Evaluation of Clustering Methods

This article proposes several criteria which isolate specific aspects of the performance of a method, such as its retrieval of inherent structure, its sensitivity to resampling and the stability of its results in the light of new data.
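The criteria proposed in this article underlie modern cluster-evaluation measures such as the (adjusted) Rand index, which scores a clustering's agreement with reference labels. An illustrative use of scikit-learn's implementation; the label vectors are made up for the example:

```python
from sklearn.metrics import adjusted_rand_score

# Reference labels and two candidate clusterings of six points.
truth = [0, 0, 0, 1, 1, 1]
perfect = [1, 1, 1, 0, 0, 0]  # same partition, merely relabeled
poor = [0, 1, 0, 1, 0, 1]     # partition unrelated to the reference

ari_perfect = adjusted_rand_score(truth, perfect)
ari_poor = adjusted_rand_score(truth, poor)
print(ari_perfect)  # 1.0: the score is invariant to label names
print(ari_poor)     # near zero: agreement at roughly chance level
```

The adjustment for chance means a random partition scores close to 0 regardless of the number of clusters, which makes the index comparable across methods.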

Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms

Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms, as it shares the same image size, data format, and training/test split structure.

NewsWeeder: Learning to Filter Netnews

Scikit-learn: Machine Learning in Python

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing…
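As one concrete example of the package's clustering API relevant to this paper's topic, scikit-learn's `AgglomerativeClustering` with a `distance_threshold` performs hierarchical clustering and returns the flat partition obtained by cutting at that scale. The data here are an illustrative toy example:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Two tight pairs of points, far apart from each other.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])

# n_clusters=None plus a distance_threshold asks for the flat partition
# obtained by cutting the single-linkage hierarchy at that scale.
model = AgglomerativeClustering(n_clusters=None, distance_threshold=1.0,
                                linkage="single")
labels = model.fit_predict(X)
print(labels)  # two clusters: one per nearby pair
```

Setting `n_clusters` directly instead of `distance_threshold` selects the flattening by cluster count rather than by scale, which is the other common way to flatten a hierarchy.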