Global and Local Information in Clustering Labeled Block Models

@article{Kanade2016GlobalAL,
  title={Global and Local Information in Clustering Labeled Block Models},
  author={Varun Kanade and Elchanan Mossel and Tselil Schramm},
  journal={IEEE Transactions on Information Theory},
  year={2016},
  volume={62},
  pages={5906-5917}
}
The stochastic block model is a classical cluster-exhibiting random graph model that has been widely studied in statistics, physics, and computer science. In its simplest form, the model is a random graph with two equal-sized clusters, with intracluster edge probability p and intercluster edge probability q. We focus on the sparse case, i.e., p, q = O(1/n), which is practically more relevant and also mathematically more challenging. A conjecture of Decelle, Krzakala, Moore, and Zdeborova… 
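
The simplest form of the model described in the abstract can be sketched in a few lines of Python. This is only an illustrative sampler, not code from the paper: the function name `sample_sbm` and the parameters `a` and `b` (with p = a/n and q = b/n, one common way to write the sparse regime p, q = O(1/n)) are assumptions made here, as is the choice to put the first n/2 vertices in one cluster.

```python
import random


def sample_sbm(n, a, b, seed=None):
    """Sample a two-cluster stochastic block model on n vertices.

    Assumed parametrization (not taken from the paper): p = a/n is the
    intracluster edge probability and q = b/n the intercluster one.
    Returns (labels, edges), where labels[v] is +1 or -1 and edges is a
    list of pairs (u, v) with u < v.
    """
    rng = random.Random(seed)
    # Vertices 0 .. n//2 - 1 get label +1, the rest -1 (equal sizes when n is even).
    labels = [1 if v < n // 2 else -1 for v in range(n)]
    p, q = a / n, b / n
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # Each pair is joined independently, with probability p or q.
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                edges.append((u, v))
    return labels, edges


if __name__ == "__main__":
    labels, edges = sample_sbm(1000, a=5.0, b=1.0, seed=0)
    intra = sum(1 for u, v in edges if labels[u] == labels[v])
    print(len(edges), "edges,", intra, "of them inside a cluster")
```

The loop over all pairs takes time quadratic in n, which is fine for illustration; a sampler meant for large sparse graphs would draw the edges more cleverly.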

Active learning for community detection in stochastic block models

TLDR
This work shows that sampling the labels of a vanishingly small fraction of nodes is sufficient for exact community detection even when D(a; b) < 1, and provides an efficient learning algorithm which recovers the community memberships of all nodes w.p. as long as the number of sampled points meets the sufficient condition.

Find Your Place: Simple Distributed Algorithms for Community Detection

TLDR
It is proved that the process resulting from this dynamics produces a clustering that exactly or approximately reflects the underlying cut in logarithmic time, under various graph models that exhibit a sparse balanced cut, including the stochastic block model.

Recovering asymmetric communities in the stochastic block model

TLDR
This work considers the sparse stochastic block model in the case where the degrees are uninformative and shows that if the community of a vanishing fraction of the vertices is revealed, then a local algorithm is optimal down to the Kesten-Stigum threshold, and explicitly quantifies its performance.

The Computer Science and Physics of Community Detection: Landscapes, Phase Transitions, and Hardness

TLDR
Community detection in graphs is the problem of finding groups of vertices which are more densely connected than they are to the rest of the graph, which provides a window into the cultures of statistical physics and statistical inference, and how those cultures think about distributions of instances, landscapes of solutions, and hardness.

Contextual Stochastic Block Model: Sharp Thresholds and Contiguity

TLDR
It is expected that the conjecture holds as soon as the average degree exceeds one, so that the graph has a giant component, and the sharp threshold for detection and weak recovery is characterized.

Streaming Belief Propagation for Community Detection

TLDR
This work introduces a simple model for networks growing over time, referred to as the streaming stochastic block model (StSBM), proves that voting algorithms have fundamental limitations, and develops a streaming belief-propagation approach for which optimality is proved in certain regimes.

Mutual information for the sparse stochastic block model

TLDR
A conjecture for the limit of this quantity is expressed in terms of a Hamilton-Jacobi equation posed over a space of probability measures, and a proof that this conjectured limit provides a lower bound for the asymptotic mutual information is shown.

Ising Model on Locally Tree-like Graphs: Uniqueness of Solutions to Cavity Equations

TLDR
This work proves there is at most one non-trivial fixed point for Ising models with zero or random (but “unbiased”) external fields.

Information Limits for Community Detection in Hypergraph with Label Information

TLDR
This work investigates the effect of label information on the exact recovery of communities in an m-uniform Hypergraph Stochastic Block Model (HSBM) and derives sharp boundaries for exact recovery under both scenarios from an information-theoretic point of view.

Density Evolution in the Degree-correlated Stochastic Block Model

TLDR
This paper addresses the more refined question of how many vertices will be misclassified on average under the stochastic block model, and shows that the minimum misclassified fraction on average is attained by a local algorithm, namely belief propagation, in time linear in the number of edges.

References

SHOWING 1-10 OF 52 REFERENCES

A Proof of the Block Model Threshold Conjecture

TLDR
This work proves the rest of the conjecture of Decelle, Krzakala, Moore and Zdeborová by providing an efficient algorithm for clustering in a way that is correlated with the true partition when s² > d.

Phase transitions in semisupervised clustering of sparse networks

TLDR
This work uses the cavity method and the associated belief propagation algorithm to study what accuracy can be achieved as a function of α, and finds that the detectability transition disappears for any α>0, in agreement with previous work.

Limits of local algorithms over sparse random graphs

TLDR
This result is the first one where the clustering property is used to formally prove limits on local algorithms, and shows that typically every two large independent sets in a random graph either have a significant intersection, or have a nearly empty intersection.

Community detection with and without prior information

TLDR
The impact of the prior information on the detection threshold is studied, and it is shown that even minute (but generic) values of ρ>0 shift the threshold downwards to its lowest possible value.

Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications

TLDR
This paper uses the cavity method of statistical physics to obtain an asymptotically exact analysis of the phase diagram of the stochastic block model, a commonly used generative model for social and biological networks, and develops a belief propagation algorithm for inferring functional groups or communities from the topology of the network.

Community detection thresholds and the weak Ramanujan property

TLDR
This work proves that for logarithmic length ℓ, the leading eigenvectors of this modified matrix B provide a non-trivial reconstruction of the underlying structure, thereby settling the conjecture of a sharp threshold on model parameters for community detection in sparse random graphs drawn from the stochastic block model.

Phase transitions in community detection: A solvable toy model

TLDR
This work studies a semisupervised setting where the correct labels for a fraction ρ of the nodes are given, and a regime analogous to the “hard but detectable” phase, where the community structure can be recovered, but only when the initial messages are sufficiently accurate.

Survey: Information Flow on Trees

  • Elchanan Mossel
  • Computer Science, Mathematics
    Graphs, Morphisms and Statistical Physics
  • 2001
TLDR
This paper surveys developments and challenges related to the problem of information flow on a tree network T, where each edge acts as an independent copy of a given channel M and information is propagated from the root.

Estimation and Prediction for Stochastic Blockmodels for Graphs with Latent Block Structure

TLDR
A posteriori blockmodeling for graphs is proposed and it is shown that when the number of vertices tends to infinity while the probabilities remain constant, the block structure can be recovered correctly with probability tending to 1.

The Metropolis Algorithm for Graph Bisection

...