• Corpus ID: 6437571

Community detection in general stochastic block models: fundamental limits and efficient recovery algorithms

@article{Abbe2015CommunityDI,
  title={Community detection in general stochastic block models: fundamental limits and efficient recovery algorithms},
  author={Emmanuel Abbe and Colin Sandon},
  journal={ArXiv},
  year={2015},
  volume={abs/1503.00609}
}
New phase transition phenomena have recently been discovered for the stochastic block model, for the special case of two non-overlapping symmetric communities. This gives rise in particular to new algorithmic challenges driven by the thresholds. This paper investigates whether a general phenomenon takes place for multiple communities, without imposing symmetry. In the general stochastic block model $\text{SBM}(n,p,Q)$, $n$ vertices are split into $k$ communities of relative size $\{p_i\}_{i… 
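
As a rough illustration of the model named in the abstract, the following is a minimal sketch of sampling a graph from $\text{SBM}(n,p,Q)$. The abstract is truncated here, so two things are assumptions rather than statements from this page: each vertex is assigned to community $i$ independently with probability $p_i$, and `Q[i][j]` is read as the probability that a vertex in community $i$ and a vertex in community $j$ are connected, independently for each pair. The function name `sample_sbm` and its signature are hypothetical.

```python
import random


def sample_sbm(n, p, Q, seed=None):
    """Sample community labels and an undirected edge list from SBM(n, p, Q).

    Assumptions (not taken from the truncated abstract):
      - p[i] is the probability that a vertex falls in community i,
      - Q[i][j] is the edge probability between communities i and j,
        with Q symmetric.
    """
    rng = random.Random(seed)
    k = len(p)
    # Assign each vertex to a community independently according to p.
    labels = rng.choices(range(k), weights=p, k=n)
    edges = []
    # Flip one independent coin per unordered pair of distinct vertices.
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < Q[labels[u]][labels[v]]:
                edges.append((u, v))
    return labels, edges


if __name__ == "__main__":
    # Two unequal communities with denser connections inside than across.
    labels, edges = sample_sbm(
        n=200, p=[0.7, 0.3], Q=[[0.10, 0.02], [0.02, 0.15]], seed=1
    )
    print(len(labels), "vertices,", len(edges), "edges")
```

The pairwise loop takes time quadratic in $n$, which is fine for small experiments; a sparse sampler would be preferable in the constant and logarithmic degree regimes that the paper studies.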

Community Detection in General Stochastic Block models: Fundamental Limits and Efficient Algorithms for Recovery
  • E. Abbe, Colin Sandon
  • Computer Science
    2015 IEEE 56th Annual Symposium on Foundations of Computer Science
  • 2015
TLDR
This paper investigates the partial and exact recovery of communities in the general SBM (in the constant and logarithmic degree regimes), and uses the generality of the results to tackle overlapping communities.
Density Evolution in the Degree-correlated Stochastic Block Model
TLDR
This paper addresses the more refined question of how many vertices will be misclassified on average under the stochastic block model, and shows that the minimum misclassified fraction on average is attained by a local algorithm, namely belief propagation, in time linear in the number of edges.
Community detection and stochastic block models: recent developments
  • E. Abbe
  • Computer Science
    J. Mach. Learn. Res.
  • 2017
TLDR
The recent developments that establish the fundamental limits for community detection in the stochastic block model are surveyed, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery.
Community Detection and Stochastic Block Models
  • E. Abbe
  • Computer Science
    Found. Trends Commun. Inf. Theory
  • 2018
TLDR
The recent developments that establish the fundamental limits for community detection in the stochastic block model are surveyed, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery.
Detection in the stochastic block model with multiple clusters: proof of the achievability conjectures, acyclic BP, and the information-computation gap
TLDR
The paper extends efficient detection to non-symmetrical SBMs with a generalized notion of detection and KS threshold, and connects ABP to a power iteration method with a nonbacktracking operator of generalized order, formalizing the interplay between message passing and spectral methods.
Side Information in the Binary Stochastic Block Model: Exact Recovery
TLDR
An efficient algorithm that incorporates the effect of side information is proposed that uses a partial recovery algorithm combined with a local improvement procedure and sufficient conditions are derived for exact recovery under this efficient algorithm.
Minimax Rates of Community Detection in Stochastic Block Models
TLDR
A general minimax theory for community detection is provided, which gives minimax rates of the mis-match ratio for a wide range of settings including homogeneous and inhomogeneous SBMs, dense and sparse networks, finite and growing number of communities.
Relative Density and Exact Recovery in Heterogeneous Stochastic Block Models
TLDR
It is shown that it is possible, in the right circumstances, to recover very small clusters (up to $\sqrt{\log n}$ size), if there are just a few of them (at most polylogarithmic in $n$).
Multisection in the Stochastic Block Model using Semidefinite Programming
TLDR
It is shown that a certain natural SDP-based algorithm solves the problem of exact recovery in the k-community SBM, with high probability, whenever \(\sqrt{\alpha} - \sqrt{\beta} > \sqrt{1}\), as long as \(k=o(\log n)\).
How robust are reconstruction thresholds for community detection?
TLDR
It is shown that the viewpoint of semirandom models can help explain why some algorithms are preferred to others in practice, in spite of the gaps in their statistical performance on random models, and that algorithms based on semidefinite programming are robust in ways that any algorithm meeting the information-theoretic threshold cannot be.

References

Exact Recovery in the Stochastic Block Model
TLDR
An efficient algorithm based on a semidefinite programming relaxation of ML is proposed, which is proved to succeed in recovering the communities close to the threshold, while numerical experiments suggest that it may achieve the threshold.
Decoding Binary Node Labels from Censored Edge Measurements: Phase Transition and Efficient Recovery
TLDR
The first goal of this paper is to determine how the edge probability p needs to scale to allow exact recovery in the presence of noise; an efficient recovery algorithm based on semidefinite programming is proposed and shown to succeed in the threshold regime up to twice the optimal rate.
Community detection thresholds and the weak Ramanujan property
TLDR
This work proves that for logarithmic length ℓ, the leading eigenvectors of this modified matrix B provide a non-trivial reconstruction of the underlying structure, thereby settling the conjecture of a sharp threshold on model parameters for community detection in sparse random graphs drawn from the stochastic block model.
Stochastic Block Model and Community Detection in Sparse Graphs: A spectral algorithm with optimal rate of recovery
TLDR
A simple and robust spectral algorithm for the stochastic block model with blocks having constant edge density, under an optimal condition on the gap between the density inside a block and the density between the blocks.
Linear inverse problems on Erdős-Rényi graphs: Information-theoretic limits and efficient recovery
TLDR
An efficient recovery algorithm based on semidefinite programming is proposed and shown to succeed in the threshold regime up to twice the optimal rate.
Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications
TLDR
This paper uses the cavity method of statistical physics to obtain an asymptotically exact analysis of the phase diagram of the stochastic block model, a commonly used generative model for social and biological networks, and develops a belief propagation algorithm for inferring functional groups or communities from the topology of the network.
Consistency Thresholds for Binary Symmetric Block Models
TLDR
This work considers the problem of reconstructing symmetric block models with two blocks of n vertices each and connection probabilities $p_n$ and $q_n$ for inter- and intra-block edge probabilities respectively and gives efficient algorithms for consistent estimators whenever one exists.
Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results
TLDR
This work proposes a computationally efficient spectral algorithm that allows for asymptotically correct inference when the average node degree could be as low as logarithmic in the total number of nodes and shows that no algorithm can achieve better inference than guessing without using the observations.
Achieving Exact Cluster Recovery Threshold via Semidefinite Programming
TLDR
It is shown that the semidefinite programming relaxation of the maximum likelihood estimator achieves the optimal threshold for exactly recovering the partition from the graph with probability tending to one in the binary symmetric stochastic block model.
Community Detection in the Labelled Stochastic Block Model
TLDR
It is proved that the given threshold correctly identifies a transition on the behaviour of belief propagation from insensitive to sensitive, and that the same threshold corresponds to the transition in a related inference problem on a tree model from infeasible to feasible.