Corpus ID: 7649493

Robust and efficient multi-way spectral clustering

  title={Robust and efficient multi-way spectral clustering},
  author={Anil Damle and Victor Minden and Lexing Ying},
We present a new algorithm for spectral clustering based on a column-pivoted QR factorization that may be directly used for cluster assignment or to provide an initial guess for k-means. Our algorithm is simple to implement, direct, and requires no initial guess. Furthermore, it scales linearly in the number of nodes of the graph and a randomized variant provides significant computational gains. Provided the subspace spanned by the eigenvectors used for clustering contains a basis that… Expand
Clustering of graph vertex subset via Krylov subspace model reduction
Two novel algorithms for spectral clustering of a subset of the graph vertices (target subset) based on the theory of model order reduction rely on realizations of a reduced order model (ROM) that accurately approximates the diffusion transfer function of the original graph for inputs and outputs restricted to the target subset. Expand
An Automated Spectral Clustering for Multi-scale Data
These findings show that iterative eigengap search with a PCA-based global scaling scheme can discover different patterns with an accuracy of higher than 90% in most cases without asking for a priori input information. Expand
Spectral Embedding Norm: Looking Deep into the Spectrum of the Graph Laplacian
The spectral embedding norm which sums the squared values of the first I normalized eigenvectors, where I can be significantly larger than K, is proposed and it is proved that this quantity can be used to separate clusters from the background in unbalanced settings, including extreme cases such as outlier detection. Expand
Distance Preserving Model Order Reduction of Graph-Laplacians and Cluster Analysis
Two novel algorithms are developed that ensure that target subset clustering is consistent with the spectral clustering of the full data set if one would perform such and a properly parametrized reduced-order model of the graph-Laplacian that approximates accurately the diffusion transfer function of the original graph for inputs and outputs restricted to the target subset. Expand
Preconditioned spectral clustering for stochastic block partition streaming graph challenge (Preliminary version at arXiv.)
Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is demonstrated to efficiently solve eigenvalue problems for graph Laplacians that appear in spectral clustering to achieve 100% correctness for all Challenge datasets with known truth partitions. Expand
Investigation of Spectral Clustering for Signed Graph Matrix Representations
It is concluded that the Gremban's expansion is the most robust representation to use for spectral clustering since it provides higher quality clusters for more combinations of inner and outer-connection probabilities for positive and negative valued edges. Expand
Preconditioned Spectral Clustering for Stochastic Block Partition Streaming Graph Challenge /Author=Zhuzhunashvili, D.; Knyazev, A. /CreationDate=September 25, 2017 /Subject=Signal Processing
Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is demonstrated to efficiently solve eigenvalue problems for graph Laplacians that appear in spectral clustering. For static graphExpand
Stabilizing discrete empirical interpolation via randomized and deterministic oversampling
This work investigates randomized and deterministic oversampling in (discrete) empirical interpolation for nonlinear model reduction. Empirical interpolation derives approximations of nonlinear termsExpand
Graph-based Semi-Supervised & Active Learning for Edge Flows
A graph-based semi-supervised learning method for learning edge flows defined on a graph that derives bounds for the method's reconstruction error and provides two active learning algorithms for selecting informative edges on which to measure flow, which has applications for optimal sensor deployment. Expand
Fast algorithm for quantum polar decomposition, pretty-good measurements, and the Procrustes problem
The polar decomposition of a matrix is a key element in the quantum linear algebra toolbox. We show that the problem of quantum polar decomposition, recently studied in Lloyd et al. [1], has a simpleExpand


Spectral Relaxation for K-means Clustering
It is shown that a relaxed version of the trace maximization problem possesses global optimal solutions which can be obtained by Computing a partial eigendecomposition of the Gram matrix, and the cluster assignment for each data vectors can be found by computing a pivoted QR decomposition ofThe eigenvector matrix. Expand
On Spectral Clustering: Analysis and an algorithm
A simple spectral clustering algorithm that can be implemented using a few lines of Matlab is presented, and tools from matrix perturbation theory are used to analyze the algorithm, and give conditions under which it can be expected to do well. Expand
A tutorial on spectral clustering
This tutorial describes different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Expand
Spectral redemption in clustering sparse networks
A way of encoding sparse data using a “nonbacktracking” matrix, and it is shown that the corresponding spectral algorithm performs optimally for some popular generative models, including the stochastic block model. Expand
A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm
It is demonstrated that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods, and eight commonly used linear time complexity initialization methods are compared. Expand
The geometry of kernelized spectral clustering
Clustering of data sets is a standard problem in many areas of science and engineering. The method of spectral clustering is based on embedding the data set using a kernel function, and using the topExpand
Achieving Exact Cluster Recovery Threshold via Semidefinite Programming
It is shown that the semidefinite programming relaxation of the maximum likelihood estimator achieves the optimal threshold for exactly recovering the partition from the graph with probability tending to one in the binary symmetric stochastic block model. Expand
On Rank-Revealing Factorisations
The problem of finding a rank-revealing QR (RRQR) factorisation of a matrix $A$ consists of permuting the columns of $A$ such that the resulting QR factorisation contains an upper triangular matrixExpand
Efficient Algorithms for Computing a Strong Rank-Revealing QR Factorization
Two algorithms are presented for computing rank-revealing QR factorizations that are nearly as efficient as QR with column pivoting for most problems and take O (ran2) floating-point operations in the worst case. Expand
Spectral clustering and the high-dimensional stochastic blockmodel
Networks or graphs can easily represent a diverse set of data sources that are characterized by interacting units or actors. Social ne tworks, representing people who communicate with each other, areExpand