Consistency of Anchor-based Spectral Clustering

@article{Kergorlay2021ConsistencyOA,
  title={Consistency of Anchor-based Spectral Clustering},
  author={Henry-Louis de Kergorlay and Desmond J. Higham},
  journal={ArXiv},
  year={2021},
  volume={abs/2006.13984}
}
Anchor-based techniques reduce the computational complexity of spectral clustering algorithms. Although empirical tests have shown promising results, there is currently a lack of theoretical support for the anchoring approach. We define a specific anchor-based algorithm and show that it is amenable to rigorous analysis, as well as being effective in practice. We establish the theoretical consistency of the method in an asymptotic setting where data is sampled from an underlying continuous… 

Spectral analysis of weighted Laplacians arising in data clustering

Summary: Graph Laplacians computed from weighted adjacency matrices are widely used to identify geometric structure in data, and clusters in particular; their spectral properties play a central role in a number of unsupervised and semi-supervised learning

TLDR
This paper investigates how the spectral gap depends on the three parameters entering the graph Laplacian, and on a parameter measuring the size of the perturbation from the perfectly clustered case, and provides insight into parameter choices made in learning algorithms which are based on weighted adjacency matrices.

Generative Hypergraph Models and Spectral Embedding

TLDR
It is shown that the hypergraph approach can outperform clustering algorithms that use only dyadic edges, and several triadic edge prediction methods are compared on high school contact data, where the algorithm improves upon benchmark methods when the amount of training data is limited.

References

Fast approximate spectral clustering

TLDR
This work develops a general framework for fast approximate spectral clustering in which a distortion-minimizing local transformation is first applied to the data, and develops two concrete instances of this framework, one based on local k-means clustering (KASP) and one based on random projection trees (RASP).

Large Scale Spectral Clustering with Landmark-Based Representation

TLDR
This paper proposes a novel approach, called Landmark-based Spectral Clustering (LSC), for large scale clustering problems, where the original data points are represented as the linear combinations of landmarks and the spectral embedding of the data can be efficiently computed with the landmark-based representation.

Consistency of spectral clustering

TLDR
It is proved that one of the two major classes of spectral clustering (normalized clustering) converges under very general conditions, while the other is only consistent under strong additional assumptions, which are not always satisfied in real data.

A variational approach to the consistency of spectral clustering

Parallel Spectral Clustering in Distributed Systems

TLDR
This work investigates two representative ways of approximating the dense similarity matrix and picks the strategy of sparsifying the matrix via retaining nearest neighbors and investigates its parallelization, which can effectively handle large problems.

Information theoretic measures for clusterings comparison: is a correction for chance necessary?

TLDR
This paper derives the analytical formula for the expected mutual information value between a pair of clusterings, and proposes the adjusted version for several popular information theoretic based measures.

A tutorial on spectral clustering

TLDR
This tutorial describes different graph Laplacians and their basic properties, presents the most common spectral clustering algorithms, and derives those algorithms from scratch by several different approaches.

Spectral clustering and its use in bioinformatics

Understanding Regularized Spectral Clustering via Graph Conductance

TLDR
The results show that unbalanced partitions from spectral clustering can be understood as overfitting to noise in the periphery of a sparse and stochastic graph and demonstrate how regularization can improve the computational speed of spectral clustering.

Spectral analysis of weighted Laplacians arising in data clustering