Corpus ID: 88514029

A nonparametric two-sample hypothesis testing problem for random dot product graphs

Authors: Minh Tang, Avanti Athreya, Daniel Lewis Sussman, Vince Lyzinski, Carey E. Priebe
Journal: arXiv: Statistics Theory
… finite-dimensional random dot product graphs have generating latent positions that are independently drawn from the same distribution, or distributions that are related via scaling or projection. We propose a test statistic that is a kernel-based function of the adjacency spectral embedding for each graph. We obtain a limiting distribution for our test statistic under the null and we show that our test procedure is consistent across a broad range of alternatives.
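The abstract's recipe (embed each graph spectrally, then compare the two embeddings with a kernel statistic) can be illustrated with a minimal sketch. It assumes a Gaussian kernel with a hand-picked bandwidth and an unbiased maximum mean discrepancy (MMD) estimate in the spirit of "A Kernel Two-Sample Test" listed further down this page. The function names, the bandwidth, the simulation settings, and the column-sign alignment heuristic are all illustrative assumptions, not the paper's exact statistic or its null calibration.

```python
import numpy as np

def sample_rdpg(X, rng):
    """Sample a symmetric, hollow adjacency matrix with edge probabilities X @ X.T."""
    P = X @ X.T
    upper = np.triu((rng.random(P.shape) < P).astype(float), k=1)
    return upper + upper.T

def adjacency_spectral_embedding(A, d):
    """Scale the d largest-magnitude eigenvectors of A by sqrt(|eigenvalue|)."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    X_hat = vecs[:, top] * np.sqrt(np.abs(vals[top]))
    # Assumption: flipping each column to have a non-negative sum removes only
    # the sign ambiguity; for d > 1 a general rotation ambiguity remains.
    signs = np.where(X_hat.sum(axis=0) < 0, -1.0, 1.0)
    return X_hat * signs

def gaussian_kernel(X, Y, bandwidth):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def mmd_unbiased(X, Y, bandwidth=0.5):
    """Unbiased estimate of squared MMD between two samples (rows are points)."""
    n, m = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    within_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    within_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return within_x + within_y - 2.0 * Kxy.mean()

# Hypothetical simulation: one-dimensional latent positions, 300 vertices per graph.
rng = np.random.default_rng(0)
n = 300
X = rng.uniform(0.2, 0.8, size=(n, 1))
Y_same = rng.uniform(0.2, 0.8, size=(n, 1))   # same latent distribution as X
Y_diff = rng.uniform(0.5, 0.9, size=(n, 1))   # shifted latent distribution

def embed(Z):
    return adjacency_spectral_embedding(sample_rdpg(Z, rng), d=1)

X_hat = embed(X)
stat_same = mmd_unbiased(X_hat, embed(Y_same))
stat_diff = mmd_unbiased(X_hat, embed(Y_diff))
```

In this sketch `stat_same` should sit near zero while `stat_diff` should be clearly larger. Turning either value into a p-value would require the limiting null distribution the abstract refers to, or a resampling scheme, neither of which is implemented here.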

A Semiparametric Two-Sample Hypothesis Testing Problem for Random Graphs
Two-sample hypothesis testing for random graphs arises naturally in neuroscience, social networks, and machine learning. In this article, we consider a semiparametric problem of two-sample…
Empirical Bayes estimation for random dot product graph representation of the stochastic blockmodel
Network models are increasingly used to model datasets that involve interacting units, particularly random graph models where the vertices represent individual entities and the edges represent the…
A central limit theorem for an omnibus embedding of multiple random graphs and implications for multiscale network inference
An "omnibus" embedding in which multiple graphs on the same vertex set are jointly embedded into a single space with a distinct representation for each graph is described, which achieves near-optimal inference accuracy and allows the identification of specific brain regions associated with population-level differences.
On Estimation and Inference in Latent Structure Random Graphs
We define a latent structure model (LSM) random graph as a random dot product graph (RDPG) in which the latent position distribution incorporates both probabilistic and geometric constraints,…
Improving Power of 2-Sample Random Graph Tests with Applications in Connectomics
It is shown that adapting multiscale graph correlation (MGC) to answer this question results in an equivalent test which outperforms several existing methods, and it is demonstrated that, on a real brain network, MGC is able to detect differences between the two sides of a larval Drosophila brain network.
This paper focuses on graphs with a fixed number of labeled nodes and introduces natural notions of center and a depth function for graphs that evolve in time, to develop several statistical techniques including testing, supervised and unsupervised classification, and a notion of principal component sets in the space of graphs.
Beyond the adjacency matrix: random line graphs and inference for networks with edge attributes
It is established that although naive spectral decompositions can fail to extract necessary signal for edge clustering, there exist signal-preserving singular subspaces of the line graph that can be recovered through a carefully chosen projection, and one can consistently estimate edge latent positions in a random line graph.
On spectral embedding performance and elucidating network structure in stochastic blockmodel graphs
Statistical inference on graphs often proceeds via spectral methods involving low-dimensional embeddings of matrix-valued graph representations such as the graph Laplacian or adjacency…
Asymptotically efficient estimators for stochastic blockmodels: the naive MLE, the rank-constrained MLE, and the spectral
We establish asymptotic normality results for estimation of the block probability matrix $\mathbf{B}$ in stochastic blockmodel graphs using spectral embedding when the average degree grows at the…
Information Recovery in Shuffled Graphs via Graph Matching
  • V. Lyzinski
  • Mathematics, Computer Science
  • IEEE Transactions on Information Theory
  • 2018
An information-theoretic foundation is provided for understanding the practical impact that errorfully observed vertex correspondences can have on subsequent inference, and the capacity of graph matching methods to recover the lost vertex alignment and inferential performance.


Two-sample Hypothesis Testing for Random Dot Product Graphs via Adjacency Spectral Embedding
A valid test is proposed for the hypothesis that two finite-dimensional random dot product graphs on a common vertex set have the same generating latent positions or have generating latent positions that are scaled or diagonal transformations of one another.
A Kernel Two-Sample Test
This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
A Consistent Adjacency Spectral Embedding for Stochastic Blockmodel Graphs
It is proved that this method for estimating the block membership of nodes in a random graph generated by a stochastic blockmodel is consistent for assigning nodes to blocks, as only a negligible number of nodes will be misassigned.
Consistent Latent Position Estimation and Vertex Classification for Random Dot Product Graphs
If class labels are observed for a number of vertices tending to infinity, then it is shown that the remaining vertices can be classified with error converging to Bayes optimal using the $k$-nearest-neighbors classification rule.
Nonparametric estimation and testing of exchangeable graph models
A specific estimator is built using the proposed 3-step procedure, which combines probability matrix estimation by Universal Singular Value Thresholding (USVT) and empirical degree sorting of the observed adjacency matrix, and it is proved that this estimation is consistent.
A comparative power analysis of the maximum degree and size invariants for random graph inference
Let $p, s \in (0, 1]$ with $s > p$, let $m, n \in \mathbb{N}$ with $1 < m < n$, and define $V = \{1, \ldots, n\}$. Let $ER(n,p)$ denote the random graph model on $V$ where each edge is independently included in the graph with…
Universally consistent vertex classification for latent positions graphs
In this work we show that, using the eigen-decomposition of the adjacency matrix, we can consistently estimate feature maps for latent position graphs with positive definite link function $\kappa$,…
A Limit Theorem for Scaled Eigenvectors of Random Dot Product Graphs
We prove a central limit theorem for the components of the largest eigenvectors of the adjacency matrix of a finite-dimensional random dot product graph whose true latent positions are…
Equivalence of distance-based and RKHS-based statistics in hypothesis testing
It is shown that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests.
Graph Kernels
A unified framework to study graph kernels is presented, and a kernel that is close to the optimal assignment kernel of Fröhlich et al. (2006) yet provably positive semi-definite is provided.