• Corpus ID: 203610268

# Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings

@article{Levin2019LimitTF,
title={Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings},
author={Keith Levin and Fred Roosta and Minh Tang and Michael W. Mahoney and Carey E. Priebe},
journal={ArXiv},
year={2019},
volume={abs/1910.00423}
}
• Published 29 September 2019
• Computer Science, Mathematics
• ArXiv
Graph embeddings, a class of dimensionality reduction techniques designed for relational data, have proven useful in exploring and modeling network structure. Most dimensionality reduction methods allow out-of-sample extensions, by which an embedding can be applied to observations not present in the training set. Applied to graphs, the out-of-sample extension problem concerns how to compute the embedding of a vertex that is added to the graph after an embedding has already been computed. In…
2 Citations


It is proved that concatenating node covariates to $\ell_2$-regularized node2vec embeddings leads to comparable, when not superior, performance to methods which incorporate node covariate and the network structure in a non-linear manner.
• Computer Science
Journal of Computational and Graphical Statistics
• 2022
It turns out that, under mild conditions, the randomized spectral clustering algorithms lead to the same theoretical bounds as those of the original spectral clustering algorithm, and this work extends the results to degree-corrected stochastic block models.

## References


• Mathematics, Computer Science
ICML
• 2018
This work considers the problem of obtaining an out-of-sample extension for the adjacency spectral embedding, a procedure for embedding the vertices of a graph into Euclidean space, and presents two different approaches based on a least-squares objective and a maximum-likelihood formulation.
• Computer Science, Mathematics
• 2013
This paper studies the out-of-sample extension for the graph embedding step and its impact on the subsequent inference tasks, and shows that, under the latent position graph model and for sufficiently large $n$, the mapping of each out-of-sample vertex is close to its true latent position.
• Computer Science, Mathematics
Network Science
• 2019
Statistical inference on graphs often proceeds via spectral methods involving low-dimensional embeddings of matrix-valued graph representations such as the graph Laplacian or adjacency
• Computer Science, Mathematics
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
• 2022
A generalisation of the latent position network model known as the random dot product graph is proposed, to allow interpretation of those vector representations as latent position estimates, and its potential to uncover richer latent structure is demonstrated.
• Mathematics, Computer Science
The Annals of Statistics
• 2018
We prove a central limit theorem for the components of the eigenvectors corresponding to the $d$ largest eigenvalues of the normalized Laplacian matrix of a finite dimensional random dot product
• Computer Science, Mathematics
• 2011
It is proved that this method to estimate block membership of nodes in a random graph generated by a stochastic blockmodel is consistent for assigning nodes to blocks, as only a negligible number of nodes will be misassigned.
• Computer Science, Mathematics
J. Mach. Learn. Res.
• 2017
This survey paper describes a comprehensive paradigm for statistical inference on random dot product graphs, a paradigm centered on spectral embeddings of adjacency and Laplacian matrices, and investigates several real-world applications, including community detection and classification in large social networks and the determination of functional and biologically relevant network properties from an exploratory data analysis of the Drosophila connectome.
• Computer Science, Mathematics
• 2017
An omnibus embedding is described in which multiple graphs on the same vertex set are jointly embedded into a single space with a distinct representation for each graph; it achieves near-optimal inference accuracy when graphs arise from a common distribution, yet retains discriminatory power as a test procedure for the comparison of different graphs.
• Computer Science
Internet Math.
• 2009
This paper employs approximation algorithms for the graph-partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities, and defines the network community profile plot, which characterizes the "best" possible community—according to the conductance measure—over a wide range of size scales.
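
The out-of-sample problem described in the abstract, together with the least-squares objective mentioned in the ICML 2018 reference above, can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the two-block latent positions, and the graph size are all hypothetical choices made for the example. It embeds an $n$-vertex graph with a rank-$d$ adjacency spectral embedding, then places one new vertex by solving $\min_w \lVert \hat{X} w - a \rVert_2$, where $a$ holds the new vertex's edges to the original vertices.

```python
import numpy as np

def adjacency_spectral_embedding(A, d):
    """Scale the d eigenvectors of symmetric A with largest |eigenvalue|."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

def out_of_sample_least_squares(X_hat, a_new):
    """Embed a new vertex as argmin_w ||X_hat @ w - a_new||_2."""
    w, *_ = np.linalg.lstsq(X_hat, a_new, rcond=None)
    return w

rng = np.random.default_rng(0)
n, d = 300, 2

# Hypothetical latent positions: two groups whose inner products
# (0.68 within a group, 0.32 across groups) are valid edge probabilities.
positions = np.array([[0.8, 0.2], [0.2, 0.8]])
labels = rng.integers(0, 2, size=n + 1)
X = positions[labels]
P = X @ X.T

# Sample a symmetric, hollow adjacency matrix on n + 1 vertices.
upper = np.triu(rng.random((n + 1, n + 1)) < P, k=1)
A_full = (upper | upper.T).astype(float)

# The first n vertices are in-sample; the last one arrives afterwards.
A = A_full[:n, :n]
a_new = A_full[n, :n]

X_hat = adjacency_spectral_embedding(A, d)
w_hat = out_of_sample_least_squares(X_hat, a_new)
```

Because spectral embeddings are identifiable only up to an orthogonal transformation, `w_hat` should be compared with the new vertex's true latent position only after alignment, or through inner products such as `X_hat @ w_hat`.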