Corpus ID: 203610268

Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings

Keith Levin, Fred Roosta, Minh Tang, Michael W. Mahoney, Carey E. Priebe
Graph embeddings, a class of dimensionality reduction techniques designed for relational data, have proven useful in exploring and modeling network structure. Most dimensionality reduction methods allow out-of-sample extensions, by which an embedding can be applied to observations not present in the training set. Applied to graphs, the out-of-sample extension problem concerns how to compute the embedding of a vertex that is added to the graph after an embedding has already been computed. In… 
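For concreteness, the in-sample embedding at issue can be sketched as the standard adjacency spectral embedding (scaled top-$d$ eigenvectors of the adjacency matrix); the function below is an illustrative sketch under that definition, not the authors' implementation.

```python
import numpy as np

def adjacency_spectral_embedding(A, d):
    """Embed the vertices of a graph into R^d using the d eigenpairs of
    the adjacency matrix with largest-magnitude eigenvalues, scaled by
    the square roots of those eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(A)          # ascending eigenvalues
    idx = np.argsort(np.abs(eigvals))[::-1][:d]   # top d by magnitude
    return eigvecs[:, idx] * np.sqrt(np.abs(eigvals[idx]))

# Toy example: a two-block stochastic blockmodel graph on 100 vertices.
rng = np.random.default_rng(0)
n = 100
z = np.repeat([0, 1], n // 2)                     # block labels
P = np.array([[0.5, 0.1], [0.1, 0.5]])[z][:, z]   # edge probabilities
A = np.triu(rng.binomial(1, P), 1)
A = A + A.T                                       # symmetric, hollow
X_hat = adjacency_spectral_embedding(A, d=2)      # shape (100, 2)
```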


Asymptotics of $\ell_2$ Regularized Network Embeddings

It is proved that concatenating node covariates to $\ell_2$ regularized node2vec embeddings leads to comparable, if not superior, performance to methods which incorporate node covariates and the network structure in a non-linear manner.

Randomized Spectral Clustering in Large-Scale Stochastic Block Models

It turns out that, under mild conditions, the randomized spectral clustering algorithms lead to the same theoretical bounds as those of the original spectral clustering algorithm, and this work extends the results to degree-corrected stochastic block models.

Out-of-sample extension of graph adjacency spectral embedding

This work considers the problem of obtaining an out-of-sample extension for the adjacency spectral embedding, a procedure for embedding the vertices of a graph into Euclidean space, and presents two different approaches based on a least-squares objective and a maximum-likelihood formulation.
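The least-squares approach mentioned here admits a compact sketch: given the in-sample embedding and the new vertex's adjacency vector to the in-sample vertices, the latent position estimate solves an ordinary least-squares problem (names below are illustrative; the maximum-likelihood variant instead maximizes a Bernoulli likelihood).

```python
import numpy as np

def oos_extend_lsq(X_hat, a_new):
    """Out-of-sample extension via least squares: estimate the new
    vertex's latent position w by minimizing ||a_new - X_hat @ w||_2."""
    w, *_ = np.linalg.lstsq(X_hat, a_new, rcond=None)
    return w

# Toy usage: 50 in-sample vertices embedded in R^2, one new vertex.
rng = np.random.default_rng(1)
X_hat = rng.normal(size=(50, 2))
a_new = rng.binomial(1, 0.3, size=50).astype(float)
w_hat = oos_extend_lsq(X_hat, a_new)   # shape (2,)
```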

Out-of-sample Extension for Latent Position Graphs

This paper studies the out-of-sample extension for the graph embedding step and its impact on subsequent inference tasks, and shows that, under the latent position graph model and for sufficiently large $n$, the mapping of the out-of-sample vertices is close to their true latent positions.

On spectral embedding performance and elucidating network structure in stochastic blockmodel graphs

Statistical inference on graphs often proceeds via spectral methods involving low-dimensional embeddings of matrix-valued graph representations such as the graph Laplacian or adjacency matrix.

A statistical interpretation of spectral embedding: The generalised random dot product graph

A generalisation of the latent position network model known as the random dot product graph is proposed, allowing those vector representations to be interpreted as latent position estimates and demonstrating the potential to uncover richer latent structure.

Scalable out-of-sample extension of graph embeddings using deep neural networks

Limit theorems for eigenvectors of the normalized Laplacian for random graphs

We prove a central limit theorem for the components of the eigenvectors corresponding to the $d$ largest eigenvalues of the normalized Laplacian matrix of a finite-dimensional random dot product graph.
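Schematically, results of this kind assert row-wise asymptotic normality of the embedding: for a fixed vertex $i$ with latent position $x_i$, and up to an orthogonal alignment $W_n$,

```latex
\sqrt{n}\,\bigl(W_n \hat{X}_i - x_i\bigr) \xrightarrow{d} \mathcal{N}\bigl(0,\, \Sigma(x_i)\bigr),
```

where $\Sigma(\cdot)$ is a covariance determined by the latent position distribution. The precise scaling, centering, and covariance differ between the adjacency and normalized-Laplacian embeddings; this is the generic shape, not the exact statement of any one theorem listed here.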

A Consistent Adjacency Spectral Embedding for Stochastic Blockmodel Graphs

It is proved that this method for estimating the block membership of nodes in a random graph generated by a stochastic blockmodel is consistent for assigning nodes to blocks, as only a negligible number of nodes will be misassigned.
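The estimator in question is, in essence, k-means on the adjacency spectral embedding; a self-contained sketch (with a plain Lloyd's k-means and an illustrative farthest-point initialization, not the paper's exact procedure):

```python
import numpy as np

def spectral_cluster(A, d, k, n_iter=50):
    """Estimate block memberships by k-means on the adjacency spectral
    embedding (scaled top-d eigenvectors of A)."""
    eigvals, eigvecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(eigvals))[::-1][:d]
    X = eigvecs[:, idx] * np.sqrt(np.abs(eigvals[idx]))
    # Lloyd's k-means with deterministic farthest-point initialization.
    centers = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Two-block stochastic blockmodel: clustering recovers the blocks.
rng = np.random.default_rng(0)
n = 100
z = np.repeat([0, 1], n // 2)
P = np.array([[0.5, 0.1], [0.1, 0.5]])[z][:, z]
A = np.triu(rng.binomial(1, P), 1)
A = A + A.T
labels = spectral_cluster(A, d=2, k=2)
acc = max(np.mean(labels == z), np.mean(labels != z))  # label-swap invariant
```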

Statistical Inference on Random Dot Product Graphs: a Survey

This survey paper describes a comprehensive paradigm for statistical inference on random dot product graphs, a paradigm centered on spectral embeddings of adjacency and Laplacian matrices, and investigates several real-world applications, including community detection and classification in large social networks and the determination of functional and biologically relevant network properties from an exploratory data analysis of the Drosophila connectome.

A central limit theorem for an omnibus embedding of random dot product graphs

An omnibus embedding in which multiple graphs on the same vertex set are jointly embedded into a single space with a distinct representation for each graph is described, achieving near-optimal inference accuracy when graphs arise from a common distribution and yet retains discriminatory power as a test procedure for the comparison of different graphs.
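The omnibus matrix described here has a simple block form: diagonal blocks are the individual adjacency matrices, and off-diagonal block $(i, j)$ is their pairwise average; the joint embedding is then the adjacency spectral embedding of this matrix. A sketch under that definition:

```python
import numpy as np

def omnibus_matrix(graphs):
    """Assemble the mn x mn omnibus matrix from m graphs on a shared
    n-vertex set: block (i, j) is (A_i + A_j) / 2, so each diagonal
    block is the corresponding graph itself."""
    m, n = len(graphs), graphs[0].shape[0]
    M = np.empty((m * n, m * n))
    for i in range(m):
        for j in range(m):
            M[i*n:(i+1)*n, j*n:(j+1)*n] = (graphs[i] + graphs[j]) / 2
    return M

# Two 5-vertex graphs on the same vertex set.
rng = np.random.default_rng(2)
A1 = np.triu(rng.binomial(1, 0.4, (5, 5)), 1); A1 = (A1 + A1.T).astype(float)
A2 = np.triu(rng.binomial(1, 0.4, (5, 5)), 1); A2 = (A2 + A2.T).astype(float)
M = omnibus_matrix([A1, A2])   # shape (10, 10), symmetric
```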

Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters

This paper employs approximation algorithms for the graph-partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities, and defines the network community profile plot, which characterizes the "best" possible community—according to the conductance measure—over a wide range of size scales.