• Corpus ID: 237091775

Confidence sets in a sparse stochastic block model with two communities of unknown sizes

@inproceedings{Kleijn2021ConfidenceSI,
  title={Confidence sets in a sparse stochastic block model with two communities of unknown sizes},
  author={B. Kleijn and Jan van Waaij},
  year={2021}
}
In a sparse stochastic block model with two communities of unequal sizes we derive two posterior concentration inequalities that imply (1) posterior (almost-)exact recovery of the community structure under sparsity bounds comparable to well-known sharp bounds in the planted bi-section model; (2) a construction of confidence sets for the community assignment from credible sets, with finite graph sizes. The latter enables exact frequentist uncertainty quantification with Bayesian credible sets at…
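For readers unfamiliar with the setting, the following is a minimal sketch of how a graph from a two-community stochastic block model with unequal community sizes could be sampled. It is not taken from the paper: the function name `sample_two_community_sbm`, the constants `a` and `b`, and the choice of edge probabilities of order log(n)/n (a scaling commonly used to describe the sparse regime in the exact-recovery literature) are all assumptions made for illustration.

```python
import math
import random


def sample_two_community_sbm(n, m, p, q, seed=0):
    """Sample an undirected two-community stochastic block model (illustrative).

    Vertices 0..m-1 are placed in community 0 and vertices m..n-1 in
    community 1, so the community sizes are m and n - m. Each pair of
    vertices is joined independently with probability p if both lie in the
    same community and with probability q otherwise.
    """
    rng = random.Random(seed)
    labels = [0 if i < m else 1 for i in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                edges.add((i, j))  # stored with i < j, no self-loops
    return labels, edges


# Hypothetical parameter choices, not values from the paper.
n, m = 200, 60            # graph size and size of the smaller community
a, b = 8.0, 1.0           # assumed within/between constants
p = a * math.log(n) / n   # within-community edge probability
q = b * math.log(n) / n   # between-community edge probability

labels, edges = sample_two_community_sbm(n, m, p, q, seed=1)
print(len(edges), "edges sampled")
```

In this sketch the community assignment `labels` plays the role of the unknown parameter that a posterior over assignments would try to recover from `edges`; the paper's inference procedure itself is not reproduced here.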
1 Citation

Consistent Bayesian community recovery in multilayer networks

TLDR
A simulation study shows that the derived bounds translate to classification accuracy that improves as the number of observed layers increases, comparable to a well-known threshold for community recovery by a single-layer stochastic block model.

References

Recovery, detection and confidence sets of communities in a sparse stochastic block model

Posterior distributions for community assignment in the planted bi-section model are shown to achieve frequentist exact recovery and detection under sharp lower bounds on sparsity. Assuming posterior

Uncertainty quantification and testing in a stochastic block model with two unequal communities

Abstract: We show posterior convergence for the community structure in the planted bi-section model, for several interesting priors. Examples include where the label on each vertex is iid Bernoulli

Exact Recovery in the Stochastic Block Model

TLDR
An efficient algorithm based on a semidefinite programming relaxation of ML is proposed, which is proved to succeed in recovering the communities close to the threshold, while numerical experiments suggest that it may achieve the threshold.

Consistent Bayesian Community Detection

TLDR
This paper studies a special class of SBMs whose communitywise connectivity probability matrix is diagonally dominant, i.e., members of the same community are more likely to connect with one another than with members from other communities.

Minimax Rates of Community Detection in Stochastic Block Models

TLDR
A general minimax theory for community detection is provided, which gives minimax rates of the mis-match ratio for a wide range of settings including homogeneous and inhomogeneous SBMs, dense and sparse networks, finite and growing number of communities.

Frequentist validity of Bayesian limits

  • B. Kleijn
  • Mathematics
    The Annals of Statistics
  • 2021
To the frequentist who computes posteriors, not all priors are useful asymptotically: in this paper Schwartz's 1965 Kullback-Leibler condition is generalised to enable frequentist interpretation of

Achieving Exact Cluster Recovery Threshold via Semidefinite Programming

TLDR
It is shown that the semidefinite programming relaxation of the maximum likelihood estimator achieves the optimal threshold for exactly recovering the partition from the graph with probability tending to one in the binary symmetric stochastic block model.

Pseudo-likelihood methods for community detection in large sparse networks

TLDR
It is proved that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two communities.

Stochastic blockmodels with growing number of classes

TLDR
It is shown that the fraction of misclassified network nodes converges in probability to zero under maximum likelihood fitting when the number of classes is allowed to grow as the root of the network size and the average network degree grows at least poly-logarithmically in this size.

Probabilistic Community Detection With Unknown Number of Communities

TLDR
A coherent probabilistic framework for simultaneous estimation of the number of communities and the community structure is proposed, adapting recently developed Bayesian nonparametric techniques to network models and developing concentration properties of nonlinear functions of Bernoulli random variables.