Corpus ID: 7123900

Consistency constraints for overlapping data clustering

@article{Culbertson2016ConsistencyCF,
  title={Consistency constraints for overlapping data clustering},
  author={Jared Culbertson and Dan P. Guralnik and Jakob Hansen and Peter F. Stiller},
  journal={ArXiv},
  year={2016},
  volume={abs/1608.04331}
}
We examine overlapping clustering schemes with functorial constraints, in the spirit of Carlsson–Mémoli. This avoids issues arising from the chaining required by partition-based methods. Our principal result shows that any clustering functor is naturally constrained to refine single-linkage clusters and be refined by maximal-linkage clusters. We work in the context of metric spaces with non-expansive maps, which is appropriate for modeling data processing that does not increase information…
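
The two bounding schemes named in the abstract can be illustrated with a minimal sketch. It assumes the common reading in which, at a fixed scale `delta`, single-linkage clusters are the connected components of the graph joining points at distance at most `delta`, and maximal-linkage clusters are the maximal cliques of that same graph (maximal subsets of diameter at most `delta`). The function names, the toy data and the scale are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def threshold_graph(points, dist, delta):
    """Adjacency sets on indices: i ~ j iff dist(points[i], points[j]) <= delta."""
    adj = {i: set() for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= delta:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def single_linkage(adj):
    """Connected components of the threshold graph (a partition)."""
    seen, clusters = set(), set()
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        clusters.add(frozenset(comp))
    return clusters


def maximal_linkage(adj):
    """Maximal cliques via basic Bron-Kerbosch (a cover; clusters may overlap)."""
    cliques = set()

    def expand(r, p, x):
        if not p and not x:
            cliques.add(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def refines(fine, coarse):
    """True if every cluster in `fine` lies inside some cluster of `coarse`."""
    return all(any(c <= d for d in coarse) for c in fine)


# Hypothetical toy data: four points on the real line, scale delta = 1.
points = [0, 1, 2, 5]
adj = threshold_graph(points, lambda a, b: abs(a - b), delta=1)
sl = single_linkage(adj)   # chaining merges indices 0, 1, 2 into one cluster
ml = maximal_linkage(adj)  # overlapping clusters {0, 1} and {1, 2}, plus {3}
print(sorted(map(sorted, sl)), sorted(map(sorted, ml)), refines(ml, sl))
```

On this toy input the single-linkage output chains the first three points into one cluster, while the maximal-linkage output keeps two overlapping clusters that share the middle point. That is the finest-versus-coarsest relationship the abstract describes, shown here for these two schemes only.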

Flattening Multiparameter Hierarchical Clustering Functors

TLDR
This work brings together topological data analysis, applied category theory, and machine learning to study multiparameter hierarchical clustering and introduces a Bayesian update algorithm for learning clustering parameters from data.

Hypergraph Co-Optimal Transport: Metric and Categorical Properties

TLDR
By enriching a hypergraph with probability measures on its nodes and hyperedges, as well as relational information capturing local and global structures, this paper obtains a general and robust framework for studying the collection of all hypergraphs.

Functorial Manifold Learning

TLDR
This work first characterizes manifold learning algorithms as functors that map pseudometric spaces to optimization objectives and that factor through hierarchical clustering functors, then uses this characterization to prove refinement bounds on manifold learning loss functions and construct a hierarchy of manifold learning algorithms based on their equivariants.

Category Theory in Machine Learning

TLDR
This work aims to document the motivations, goals and common themes across these applications of category theory in machine learning, touching on gradient-based learning, probability, and equivariant learning.

Chase: Control of Heterogeneous Autonomous Sensors for Situational Awareness

TLDR
The overarching goal throughout the six years of the project's existence remained the discovery and analysis of new foundational methodology for information collection and fusion that exercises rigorous feedback control over information collection assets, simultaneously managing information and physical aspects of their states.

Functorial Manifold Learning and Overlapping Clustering

TLDR
A unified functorial perspective on manifold learning and clustering is developed, and several state-of-the-art manifold learning algorithms are expressed as functors at different levels of this hierarchy, including Laplacian Eigenmaps, Metric Multidimensional Scaling, and UMAP.

Functorial Clustering via Simplicial Complexes

TLDR
This work introduces the maximal and single linkage clustering algorithms as the respective composition of the flagification and connected components functors with a finite singular set functor and demonstrates that all other hierarchical overlapping clustering functors are refined by maximal linkage and refine single linkage.

References

An Impossibility Theorem for Clustering

TLDR
A formal perspective on the difficulty in finding a unified framework for reasoning about clustering at a technical level is suggested, in the form of an impossibility theorem: for a set of three simple properties, it is shown that there is no clustering function satisfying all three.

Characterization, Stability and Convergence of Hierarchical Clustering Methods

TLDR
It is shown that within this framework, one can prove a theorem analogous to one of Kleinberg (2002), in which one obtains an existence and uniqueness theorem instead of a non-existence result.

Enhanced Topology-Sensitive Clustering by Reeb Graph Shattering

TLDR
Preliminary experimental results are provided to demonstrate that the improved topology-sensitive clustering algorithm yields a more accurate and reliable description of the topology of the underlying scalar function.

One-to-One Correspondence Between Indexed Cluster Structures and Weakly Indexed Closed Cluster Structures

We place ourselves in a setting where singletons are not all required to be clusters, and we show that the resulting cluster structures and their corresponding closure under finite nonempty

Classifying Clustering Schemes

TLDR
A framework is constructed for studying what happens when one imposes various structural conditions on the clustering schemes, under the general heading of functoriality, and it is shown that, within this framework, one can prove a theorem analogous to one of Kleinberg (Becker et al).

Persistence-Based Clustering in Riemannian Manifolds

TLDR
A clustering scheme that combines a mode-seeking phase with a cluster merging phase in the corresponding density map, and whose output clusters have the property that their spatial locations are bound to the ones of the basins of attraction of the peaks of the density.

Combinatorial optimisation and hierarchical classifications

TLDR
Within the galaxy of optimization, some selected topics relating Combinatorial Optimization and Hierarchical Classification are discussed, including NP-completeness results, the search for polynomial instances, and some standard algorithmic approaches.

The Construction of Hierarchic and Non-Hierarchic Classifications

TLDR
A theoretical framework within which the properties of cluster methods, which operate on data in the form of a dissimilarity coefficient on a set of objects, may be discussed is outlined.

Weak Hierarchies: A Central Clustering Structure

The k-weak hierarchies, for k ≥ 2, are the cluster collections such that the intersection of any (k + 1) members equals the intersection of some k of them. Any cluster collection turns out to be a