Spectral clustering and the high-dimensional stochastic blockmodel

Authors: Karl Rohe, Sourav Chatterjee, Bin Yu
Journal: Annals of Statistics
Networks or graphs can easily represent a diverse set of data sources that are characterized by interacting units or actors. Social networks, representing people who communicate with each other, are one example. Communities or clusters of highly connected actors form an essential feature in the structure of several empirical networks. Spectral clustering is a popular and computationally feasible method to discover these communities. The Stochastic Block Model (Holland et al., 1983) is a…
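
As a rough illustration of the setting in the abstract, the following Python sketch samples a two-block Stochastic Block Model and recovers the blocks from an eigenvector of the normalized matrix D^{-1/2} A D^{-1/2}. The function names (`sample_sbm`, `spectral_bipartition`), the block sizes, and the edge probabilities are assumptions chosen for the example. Thresholding the sign of a single eigenvector is a simplification that only applies to two blocks, so this should be read as a sketch of the general idea rather than the paper's exact procedure.

```python
import numpy as np


def sample_sbm(block_sizes, B, rng):
    """Sample a symmetric 0/1 adjacency matrix (no self-loops) from an SBM.

    block_sizes: number of nodes in each block (hypothetical example input).
    B: square array of between-block edge probabilities.
    """
    z = np.repeat(np.arange(len(block_sizes)), block_sizes)  # true block labels
    P = B[z][:, z]                                  # edge probability per node pair
    upper = np.triu(rng.random(P.shape) < P, k=1)   # independent draws above diagonal
    A = (upper | upper.T).astype(float)             # symmetrize, diagonal stays zero
    return A, z


def spectral_bipartition(A):
    """Split nodes into two groups using D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))  # guard against isolated nodes
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)                  # eigenvalues in ascending order
    order = np.argsort(-np.abs(vals))               # largest absolute eigenvalue first
    v2 = vecs[:, order[1]]                          # second leading eigenvector
    return (v2 > 0).astype(int)                     # sign threshold (two blocks only)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = np.array([[0.5, 0.05], [0.05, 0.5]])        # assumed edge probabilities
    A, z = sample_sbm([100, 100], B, rng)
    labels = spectral_bipartition(A)
    # Cluster labels are only defined up to a swap, so score both assignments.
    accuracy = max(np.mean(labels == z), np.mean(labels != z))
    print(f"fraction of nodes correctly grouped: {accuracy:.3f}")
```

With more than two blocks, the sign threshold would need to be replaced by a clustering step (for example k-means) on several leading eigenvectors.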


Consistency of Spectral Clustering on Hierarchical Stochastic Block Models

A recursive bi-partitioning algorithm is developed that divides the network into two communities based on the Fiedler vector of the unnormalized graph Laplacian and repeats the split until a stopping rule indicates no further community structures.
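
A single split of the kind described above can be sketched in Python as follows. The helper name `fiedler_split` and the toy graph (two 4-node cliques joined by one edge) are assumptions made for illustration. The recursion and the stopping rule are not specified in the summary, so they are deliberately left out.

```python
import numpy as np


def fiedler_split(A):
    """Return a boolean mask splitting nodes by the sign of the Fiedler vector.

    A is a symmetric adjacency matrix of a connected graph.
    """
    L = np.diag(A.sum(axis=1)) - A      # unnormalized graph Laplacian L = D - A
    vals, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    fiedler = vecs[:, 1]                # eigenvector of the second-smallest eigenvalue
    return fiedler >= 0


# Toy graph (hypothetical): two 4-node cliques joined by the bridge edge (3, 4).
A = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

mask = fiedler_split(A)
side_a = np.flatnonzero(mask)           # node indices on one side of the split
side_b = np.flatnonzero(~mask)          # node indices on the other side
```

A recursive version would call `fiedler_split` again on the subgraph induced by each side until some stopping criterion is met.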

On the efficacy of higher-order spectral clustering under weighted stochastic block models

When the network is dense and the weights carry only a weak signal, higher-order spectral clustering can yield a performance gain in clustering.

Community Detection and Classification in Hierarchical Stochastic Blockmodels

This work proposes a robust, scalable, integrated methodology for community detection and community comparison in graphs, and addresses the problem of locating similar sub-communities in a partially reconstructed Drosophila connectome and in the social network Friendster.

Covariate-assisted spectral clustering

This work applies the clustering method to large brain graphs derived from diffusion MRI data, using the node locations or neurological region membership as covariates, and yields results superior both to regularized spectral clustering without node covariates and to an adaptation of canonical correlation analysis.

Spectral Clustering on Spherical Coordinates Under the Degree-Corrected Stochastic Blockmodel

A novel spectral clustering algorithm is proposed for community detection under the degree-corrected stochastic blockmodel based on a transformation of the spectral embedding to spherical coordinates, and a novel modeling assumption in the transformed space.

Higher-Order Spectral Clustering under Superimposed Stochastic Block Model

Non-asymptotic upper bounds on the misclustering error of spectral community detection for a SupSBM setting in which triangles or 3-uniform hyperedges are superimposed with undirected edges are proved.

Spectral Clustering for Multiple Sparse Networks: I

The spectral clustering methods are shown to work under sufficiently mild conditions on the number of multiple networks to detect associative community structures, even if all the individual networks are sparse and most of the individual networks are below the community detectability threshold.

Consistency of spectral clustering in stochastic block models

It is shown that, under mild conditions, spectral clustering applied to the adjacency matrix of the network can consistently recover hidden communities even when the order of the maximum expected degree is as small as $\log n$ with $n$ the number of nodes.

Latent structure blockmodels for Bayesian spectral graph clustering

A class of models called latent structure block models (LSBM) is proposed, allowing for graph clustering when community-specific one-dimensional manifold structure is present, and is shown to perform well on simulated and real-world network data.

On Spectral Clustering for Sparse Stochastic Block Models

  • Computer Science
  • 2017
It is shown that the algorithm can recover the hidden communities with vanishing misclustering rate even when the expected node degrees grow only logarithmically in the size of the network.



Model‐based clustering for social networks

A new model is proposed, the latent position cluster model, under which the probability of a tie between two actors depends on the distance between them in an unobserved Euclidean ‘social space’, and the actors’ locations in the latent social space arise from a mixture of distributions, each corresponding to a cluster.

Statistical properties of community structure in large social and information networks

It is found that a generative model, in which new edges are added via an iterative "forest fire" burning process, is able to produce graphs exhibiting a network community structure similar to that observed in nearly every network dataset examined.

Community structure in social and biological networks

  • M. Girvan, M. Newman
  • Computer Science
    Proceedings of the National Academy of Sciences of the United States of America
  • 2002
This article proposes a method for detecting communities, built around the idea of using centrality indices to find community boundaries, and tests it on computer-generated and real-world graphs whose community structure is already known and finds that the method detects this known structure with high sensitivity and reliability.

Latent Space Approaches to Social Network Analysis

This work develops a class of models where the probability of a relation between actors depends on the positions of individuals in an unobserved “social space,” and proposes Markov chain Monte Carlo procedures for making inference on latent positions and the effects of observed covariates.

Community detection in graphs

Finding and evaluating community structure in networks.

  • M. Newman, M. Girvan
  • Computer Science
    Physical review. E, Statistical, nonlinear, and soft matter physics
  • 2004
It is demonstrated that the algorithms proposed are highly effective at discovering community structure in both computer-generated and real-world network data, and can be used to shed light on the sometimes dauntingly complex structure of networked systems.

A Survey of Statistical Network Models

The historical development of statistical network modeling is reviewed and a number of examples that have been studied in the network literature are introduced; a subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections.

Statistical mechanics of complex networks

A simple model based on these two principles was able to reproduce the power-law degree distribution of real networks, indicating a heterogeneous topology in which the majority of the nodes have a small degree, but there is a significant fraction of highly connected nodes that play an important role in the connectivity of the network.