Community detection with a subsampled semidefinite program

  • Pedro Abdalla, Afonso S. Bandeira
  • Published 2 February 2021
  • Computer Science
  • Sampling Theory, Signal Processing, and Data Analysis
Semidefinite programming is an important tool for tackling several problems in data science and signal processing, including clustering and community detection. However, semidefinite programs are often slow in practice, so speed-up techniques such as sketching are often considered. In the context of community detection in the stochastic block model, Mixon and Xie (IEEE Trans Inform Theory 67(10): 6832–6840, 2021) have recently proposed a sketching framework in which a semidefinite program is…

Sketch-and-Lift: Scalable Subsampled Semidefinite Program for K-means Clustering

The proposed sketch-and-lift (SL) approach solves an SDP on a subsampled dataset and then propagates the solution to all data points by a nearest-centroid rounding procedure. It performs comparably to the original K-means SDP at substantially reduced runtime.
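
The "lift" step described above can be illustrated with a minimal sketch. It assumes cluster labels for the subsample are already available; here they are supplied by hand, standing in for the output of the subsampled SDP, which is not solved. The function names `centroids` and `lift` are hypothetical, chosen for this illustration only, and are not taken from the paper.

```python
import math


def centroids(points, labels, k):
    """Mean of the subsampled points in each of the k clusters.

    Assumes every cluster has at least one subsampled point.
    """
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for j, x in enumerate(p):
            sums[c][j] += x
    return [[s / counts[c] for s in sums[c]] for c in range(k)]


def lift(all_points, cents):
    """Assign every point to its nearest centroid (Euclidean distance)."""
    return [
        min(range(len(cents)), key=lambda c: math.dist(p, cents[c]))
        for p in all_points
    ]


# Hypothetical subsample with hand-supplied labels (stand-in for the SDP output).
sub_points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
sub_labels = [0, 0, 1]
cents = centroids(sub_points, sub_labels, k=2)

# Propagate the subsample clustering to the full dataset.
full_points = [(0.5, 0.5), (9.0, 9.0), (1.0, -1.0)]
full_labels = lift(full_points, cents)
```

Only the small problem on the subsample needs an SDP solver in this scheme; the propagation step costs one distance computation per point and centroid.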



Sketching Semidefinite Programs for Faster Clustering

This paper shows how to sketch a popular semidefinite relaxation of a graph clustering problem known as minimum bisection. The analysis supports a meta-claim that the clustering task is less computationally burdensome when there is more signal.

Achieving Exact Cluster Recovery Threshold via Semidefinite Programming: Extensions

It is shown that SDP relaxations also achieve the sharp recovery threshold in the following cases: 1) binary SBM with two clusters of sizes proportional to network size but not necessarily equal; 2) SBM with a fixed number of equal-sized clusters; and 3) binary censored block model with the background graph being Erdős-Rényi.

Community detection and stochastic block models: recent developments

  • E. Abbe
  • Computer Science
    J. Mach. Learn. Res.
  • 2017
The recent developments that establish the fundamental limits for community detection in the stochastic block model are surveyed, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery.

Dimensionality reduction of SDPs through sketching

Exact Recovery in the Stochastic Block Model

An efficient algorithm based on a semidefinite programming relaxation of ML is proposed, which is proved to succeed in recovering the communities close to the threshold, while numerical experiments suggest that it may achieve the threshold.

Interior Point Methods in Semidefinite Programming with Applications to Combinatorial Optimization

It is argued that many known interior point methods for linear programs can be transformed in a mechanical way to algorithms for SDP with proofs of convergence and polynomial time complexity carrying over in a similar fashion.

Consistency Thresholds for the Planted Bisection Model

It is shown that the planted bisection is recoverable asymptotically if and only if with high probability every node belongs to the same community as the majority of its neighbors.
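
The condition in this summary is easy to check on a sampled graph. The following is a minimal sketch, assuming a two-community model with within-community edge probability `p` and cross-community probability `q`; the function names are hypothetical, and treating ties (including isolated nodes) as failures is an assumption of this sketch. The result itself is asymptotic, so the code only tests the condition on one finite sample.

```python
import random


def sample_planted_bisection(n, p, q, rng):
    """Sample a graph with two communities: edge prob p inside, q across."""
    labels = [0] * (n // 2) + [1] * (n - n // 2)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                adj[i].add(j)
                adj[j].add(i)
    return labels, adj


def every_node_agrees_with_majority(labels, adj):
    """True if each node has strictly more neighbors in its own community.

    Assumption of this sketch: ties, including isolated nodes, count as failure.
    """
    for i, nbrs in enumerate(adj):
        same = sum(1 for j in nbrs if labels[j] == labels[i])
        if same <= len(nbrs) - same:
            return False
    return True


rng = random.Random(0)
labels, adj = sample_planted_bisection(200, 0.5, 0.1, rng)
print(every_node_agrees_with_majority(labels, adj))
```

Repeating the check over many samples while bringing `p` and `q` closer together gives an empirical feel for where the condition starts to fail.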

Sketchy Decisions: Convex Low-Rank Matrix Optimization with Optimal Storage

This paper proposes the first algorithm to offer provable convergence to an optimal point with an optimal memory footprint. It modifies a standard convex optimization method to work on a sketched version of the decision variable and can recover the solution from this sketch.

Sketching as a Tool for Numerical Linear Algebra

This survey highlights the recent advances in algorithms for numerical linear algebra that have come from the technique of linear sketching, and considers least squares as well as robust regression problems, low rank approximation, and graph sparsification.

High-Dimensional Probability

A broad range of illustrations is embedded throughout, including classical and modern results for covariance estimation, clustering, networks, semidefinite programming, coding, dimension reduction, matrix completion, machine learning, compressed sensing, and sparse regression.