Spectral redemption in clustering sparse networks

Authors: Florent Krzakala, Cristopher Moore, Elchanan Mossel, Joe Neeman, Allan Sly, Lenka Zdeborová, Pan Zhang
Journal: Proceedings of the National Academy of Sciences
Pages: 20935-20940
Significance: Spectral algorithms are widely applied to data clustering problems, including finding communities or partitions in graphs and networks. We propose a way of encoding sparse data using a "nonbacktracking" matrix, and show that the corresponding spectral algorithm performs optimally for some popular generative models, including the stochastic block model. This is in contrast with classical spectral algorithms, based on the adjacency matrix, random walk matrix, and graph Laplacian…
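To make the "nonbacktracking" encoding concrete, here is a minimal sketch in Python. The abstract above does not spell out the matrix, so the definition used here is an assumption based on the standard one: the matrix B is indexed by directed edges, and the entry for (u -> v, w -> x) is 1 when v = w and x != u. The function name and the toy graph are hypothetical, chosen only for illustration.

```python
import numpy as np

def nonbacktracking_matrix(edges):
    """Build the nonbacktracking matrix of an undirected simple graph.

    `edges` is a list of (u, v) pairs. Each undirected edge becomes two
    directed edges; B[i, j] = 1 when directed edge i = (u -> v) can be
    followed by directed edge j = (v -> x) with x != u.
    """
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))
    m = len(directed)
    B = np.zeros((m, m), dtype=int)
    for i, (u, v) in enumerate(directed):
        for j, (w, x) in enumerate(directed):
            if v == w and x != u:
                B[i, j] = 1
    return B, directed

# Toy graph (hypothetical): a triangle 0-1-2 with a pendant vertex 3 attached to 2.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
B, directed = nonbacktracking_matrix(edges)

# B is not symmetric, so its spectrum is complex in general.
spectrum = np.linalg.eigvals(B)
```

The double loop is quadratic in the number of directed edges, which is fine for a toy graph; a sparse representation would be the natural choice for real networks.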

Spectral community detection in sparse networks

This work makes use of a relaxation method to derive a spectral community detection algorithm that works well even in the sparse regime where other methods break down. Interestingly, however, the matrix at the heart of the method is not exactly the non-backtracking matrix, but a variant of it with a somewhat different definition.

A unified framework for spectral clustering in sparse graphs

It is demonstrated that a conveniently parametrized form of the regularized Laplacian matrix can be used to perform spectral clustering in sparse networks, without suffering from their degree heterogeneity.

Optimized Deformed Laplacian for Spectrum-based Community Detection in Sparse Heterogeneous Graphs

This article studies spectral clustering based on the deformed Laplacian matrix $D-rA$ for sparse heterogeneous graphs (following a two-class degree-corrected stochastic block model) and shows that, unlike competing methods such as the Bethe Hessian or non-backtracking operator approaches, clustering is insensitive to the graph heterogeneity.

Consistency of spectral clustering in stochastic block models

It is shown that, under mild conditions, spectral clustering applied to the adjacency matrix of the network can consistently recover hidden communities even when the order of the maximum expected degree is as small as $\log n$ with $n$ the number of nodes.
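As a rough illustration of spectral clustering on the adjacency matrix, the sketch below handles only the two-community case and splits nodes by the sign of the eigenvector belonging to the second-largest eigenvalue. This is a simplification assumed here for brevity; the summary above does not specify the clustering step, and practical pipelines typically run k-means on several leading eigenvectors. The function name and toy graph are hypothetical.

```python
import numpy as np

def two_community_adjacency_split(A):
    """Label nodes 0/1 by the sign of the second leading eigenvector of A.

    A must be a symmetric adjacency matrix. np.linalg.eigh returns
    eigenvalues in ascending order, so column -2 holds the eigenvector
    of the second-largest eigenvalue.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    second = eigenvectors[:, -2]
    return (second >= 0).astype(int)

# Toy graph (hypothetical): two 4-cliques, {0..3} and {4..7}, joined by edge 3-4.
n = 8
A = np.zeros((n, n))
for block in (range(0, 4), range(4, 8)):
    for u in block:
        for v in block:
            if u != v:
                A[u, v] = 1.0
A[3, 4] = A[4, 3] = 1.0

labels = two_community_adjacency_split(A)
```

The overall sign of an eigenvector is arbitrary, so which clique receives label 0 and which receives label 1 can differ between runs or platforms; only the grouping is meaningful.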

Semi-Supervised Clustering of Sparse Graphs: Crossing the Information-Theoretic Threshold

It is proved that with an arbitrary fraction of the labels revealed, the detection problem is feasible throughout the parameter domain, and two efficient algorithms are introduced, one combinatorial and one based on optimization, to integrate label information with graph structures.

Consistency of Spectral Clustering on Hierarchical Stochastic Block Models

A recursive bi-partitioning algorithm is developed that divides the network into two communities based on the Fiedler vector of the unnormalized graph Laplacian and repeats the split until a stopping rule indicates no further community structures.
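The recursive bi-partitioning idea can be sketched as follows, using the Fiedler vector (the eigenvector of the second-smallest eigenvalue) of the unnormalized Laplacian L = D - A. The stopping rule of the cited paper is not given in the summary above, so the minimum-size rule below is a stand-in assumption, and all names are hypothetical.

```python
import numpy as np

def fiedler_split(A):
    """Boolean mask splitting nodes by the sign of the Fiedler vector of L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    eigenvalues, eigenvectors = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigenvectors[:, 1] >= 0

def recursive_bipartition(A, nodes=None, min_size=3):
    """Recursively split the graph; stop when a part would fall below min_size.

    The min_size rule is an assumed placeholder for a real stopping rule.
    Each induced subgraph is assumed to be connected, otherwise the
    Fiedler vector is not well defined.
    """
    if nodes is None:
        nodes = np.arange(A.shape[0])
    if len(nodes) < 2 * min_size:
        return [sorted(nodes.tolist())]
    mask = fiedler_split(A[np.ix_(nodes, nodes)])
    left, right = nodes[mask], nodes[~mask]
    if len(left) < min_size or len(right) < min_size:
        return [sorted(nodes.tolist())]
    return (recursive_bipartition(A, left, min_size)
            + recursive_bipartition(A, right, min_size))

# Toy graph (hypothetical): triangles {0,1,2} and {3,4,5} joined by edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

communities = recursive_bipartition(A)
```

On this toy graph the first split separates the two triangles, and each triangle is then too small to split again under the assumed rule. The order in which the two communities are returned depends on the arbitrary sign of the eigenvector.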

Nonbacktracking spectral clustering of nonuniform hypergraphs

This work studies spectral clustering for nonuniform hypergraphs based on the hypergraph nonbacktracking operator, proves a theorem of Ihara-Bass type to enable faster computation of eigenpairs, and proposes an alternating algorithm for inference in a hypergraph stochastic blockmodel via linearized belief propagation.

Community detection in multilayer graphs using spectral methods and core-finding

A heuristic algorithm inspired by Belief Propagation and identifying core communities is proposed and discussed, and a finite-time bound on the misclassification rate is proved.

An Adaptive Spectral Algorithm for the Recovery of Overlapping Communities in Networks

Combinatorial spectral clustering, a simple spectral algorithm designed to identify overlapping communities in networks, is presented and is shown to perform well on simulated data and on real-world graphs with known overlapping communities.

A Spectral Algorithm with Additive Clustering for the Recovery of Overlapping Communities in Networks

An adaptive version of the algorithm, which does not require knowledge of the number of hidden communities, is proved to be consistent under the stochastic blockmodel with overlap (SBMO) when the degrees in the graph are (slightly more than) logarithmic.



Pseudo-likelihood methods for community detection in large sparse networks

It is proved that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two communities.

An efficient and principled method for detecting communities in networks

This work describes a method for finding overlapping communities based on a principled statistical approach using generative network models and shows how the method can be implemented using a fast, closed-form expectation-maximization algorithm that allows us to analyze networks of millions of nodes in reasonable running times.

Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications

This paper uses the cavity method of statistical physics to obtain an asymptotically exact analysis of the phase diagram of the stochastic block model, a commonly used generative model for social and biological networks, and develops a belief propagation algorithm for inferring functional groups or communities from the topology of the network.

Graph spectra and the detectability of community structure in networks

Using methods from random matrix theory, the spectra of networks that display community structure are calculated, and it is shown that spectral modularity maximization is an optimal detection method in the sense that no other method will succeed in the regime where the modularity method fails.

Spectral partitioning of random graphs

  • Frank McSherry
  • Computer Science, Mathematics
    Proceedings 2001 IEEE International Conference on Cluster Computing
  • 2001
This paper shows that a simple spectral algorithm can solve the bisection, graph coloring, and clique problems in the average case, as well as a more general problem of partitioning graphs based on edge density.

Graph Partitioning via Adaptive Spectral Techniques

It is shown that on input (G, k) the partition V1,.

A tutorial on spectral clustering

This tutorial describes different graph Laplacians and their basic properties, presents the most common spectral clustering algorithms, and derives those algorithms from scratch by several different approaches.

Community structure in social and biological networks

  • M. Girvan, M. Newman
  • Computer Science
    Proceedings of the National Academy of Sciences of the United States of America
  • 2002
This article proposes a method for detecting communities, built around the idea of using centrality indices to find community boundaries. Tested on computer-generated and real-world graphs whose community structure is already known, the method detects this known structure with high sensitivity and reliability.

Phase transition in the detection of modules in sparse networks

An asymptotically exact analysis of the problem of detecting communities in sparse random networks generated by stochastic block models using the cavity method of statistical physics and its relationship to belief propagation yields an optimal inference algorithm for detecting modules.

Scalable Inference of Overlapping Communities

A scalable algorithm is developed for posterior inference of overlapping communities in large networks, based on stochastic variational inference in the mixed-membership stochastic blockmodel (MMSB). It converges several orders of magnitude faster than the state-of-the-art algorithm for MMSB and detects the true communities in 280 benchmark networks with equal or better accuracy.