• Corpus ID: 245704627

Asymptotics of $\ell_2$ Regularized Network Embeddings

@inproceedings{Davison2022AsymptoticsO,
  title={Asymptotics of \$\ell\_2\$ Regularized Network Embeddings},
  author={Andrew Davison},
  year={2022}
}
  • A. Davison
  • Published 5 January 2022
  • Computer Science
A common approach to solving prediction tasks on large networks, such as node classification or link prediction, begins by learning a Euclidean embedding of the nodes of the network, from which traditional machine learning methods can then be applied. This includes methods such as DeepWalk and node2vec, which learn embeddings by optimizing stochastic losses formed over subsamples of the graph at each iteration of stochastic gradient descent. In this paper, we study the effects of adding an $\ell_2$ … 
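
As a rough illustration of the training setup described in the abstract, the sketch below (Python/NumPy) adds an $\ell_2$ penalty on the embedding vectors to a skip-gram-style stochastic loss computed over a subsample of node pairs at each gradient step. The pair sampler, dimensions, and penalty weight lam are hypothetical choices for illustration, not the paper's setup.

import numpy as np

rng = np.random.default_rng(0)
n, d, lam, lr = 100, 16, 1e-3, 0.025      # nodes, embedding dim, l2 weight, step size
U = rng.normal(scale=0.1, size=(n, d))    # node embedding vectors

def sgd_step(pos_pairs, neg_pairs):
    """One stochastic step on a subsample of (i, j) pairs.

    pos_pairs: pairs treated as co-occurring (e.g. drawn from random walks);
    neg_pairs: negative samples. The penalty lam * ||U||^2 is added to the
    subsampled loss, so each step also shrinks the embeddings slightly.
    """
    grad = np.zeros_like(U)
    for (i, j), y in [(p, 1.0) for p in pos_pairs] + [(p, 0.0) for p in neg_pairs]:
        s = 1.0 / (1.0 + np.exp(-U[i] @ U[j]))   # sigmoid score for the pair
        grad[i] += (s - y) * U[j]                # cross-entropy gradient
        grad[j] += (s - y) * U[i]
    grad += 2 * lam * U                          # gradient of the l2 penalty
    U[:] = U - lr * grad

# example usage, with random pairs standing in for walk subsamples
pos = [tuple(rng.integers(0, n, size=2)) for _ in range(32)]
neg = [tuple(rng.integers(0, n, size=2)) for _ in range(32)]
sgd_step(pos, neg)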


References

Showing 1-10 of 72 references

Asymptotics of Network Embeddings Learned via Subsampling

TLDR
This work encapsulates representation methods using a subsampling approach, such as node2vec, into a single unifying framework and proves, under the assumption that the graph is exchangeable, that the distribution of the learned embedding vectors asymptotically decouples.

node2vec: Scalable Feature Learning for Networks

TLDR
In node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks, a flexible notion of a node's network neighborhood is defined and a biased random walk procedure is designed, which efficiently explores diverse neighborhoods.
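
A minimal sketch of the second-order biased random walk that node2vec uses to define a node's neighbourhood; the (p, q) reweighting follows the usual description of the algorithm, but the adjacency-list representation and the helper name node2vec_walk are illustrative assumptions rather than the reference implementation.

import random

def node2vec_walk(adj, start, length, p=1.0, q=1.0):
    """Biased second-order random walk on an unweighted graph.

    adj: dict mapping each node to a list of neighbours. Returning to the
    previous node is reweighted by 1/p, moving to a neighbour of the previous
    node by 1, and moving further away by 1/q.
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = adj[cur]
        if not nbrs:
            break
        if len(walk) == 1:
            walk.append(random.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for x in nbrs:
            if x == prev:
                weights.append(1.0 / p)    # step back to where we came from
            elif x in adj[prev]:
                weights.append(1.0)        # stay close to the previous node
            else:
                weights.append(1.0 / q)    # move outward / explore
        walk.append(random.choices(nbrs, weights=weights, k=1)[0])
    return walk

# toy usage on a 4-cycle
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(node2vec_walk(adj, start=0, length=8, p=0.5, q=2.0))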

Inductive Representation Learning on Large Graphs

TLDR
GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.
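
A rough sketch of the kind of inductive neighbourhood aggregation GraphSAGE performs, using a simple mean aggregator over node features; the single-layer setup, weight shapes, and function name sage_mean_layer are assumptions made for illustration.

import numpy as np

def sage_mean_layer(X, adj, W_self, W_neigh):
    """One GraphSAGE-style layer with a mean aggregator.

    X: (n, d_in) node features; adj: dict of neighbour lists. Each node
    combines its own features with the mean of its neighbours' features, so
    embeddings can be produced for nodes unseen at training time as long as
    their features and neighbourhoods are available.
    """
    n, _ = X.shape
    H = np.zeros((n, W_self.shape[1]))
    for v in range(n):
        nbrs = adj[v]
        agg = X[nbrs].mean(axis=0) if nbrs else np.zeros(X.shape[1])
        H[v] = X[v] @ W_self + agg @ W_neigh
    return np.maximum(H, 0.0)          # ReLU nonlinearity

# toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
W_self, W_neigh = rng.normal(size=(3, 2)), rng.normal(size=(3, 2))
print(sage_mean_layer(X, adj, W_self, W_neigh).shape)   # (4, 2)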

Consistency of random-walk based network embedding algorithms

TLDR
This paper established large-sample error bounds and prove consistent community recovery of node2vec/DeepWalk embedding followed by k-means clustering and suggests using larger window sizes, or equivalently, taking longer random walks, in order to attain better convergence rate for the resulting embeddings.

Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings

TLDR
This paper proves that when the underlying graph is generated according to a latent space model called the random dot product graph, an out-of-sample extension based on a least-squares objective obeys a central limit theorem about the true latent position of the out-of-sample vertex.
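
The least-squares out-of-sample extension referred to above can be sketched as a linear least-squares problem: given estimated in-sample latent positions and the new vertex's adjacency vector, solve for the position that best reproduces its edges. Variable names and the toy data below are illustrative.

import numpy as np

def out_of_sample_extension(X_hat, a_new):
    """Least-squares out-of-sample embedding for a new vertex.

    X_hat: (n, d) estimated latent positions of the in-sample vertices
    (e.g. from an adjacency spectral embedding of a random dot product graph).
    a_new: (n,) vector of edges between the new vertex and the in-sample ones.
    Solves min_w || a_new - X_hat @ w ||^2.
    """
    w, *_ = np.linalg.lstsq(X_hat, a_new, rcond=None)
    return w

# toy usage: 5 in-sample vertices embedded in 2 dimensions
rng = np.random.default_rng(0)
X_hat = rng.uniform(0.2, 0.8, size=(5, 2))
a_new = rng.binomial(1, X_hat @ np.array([0.5, 0.5]))   # edges to the new vertex
print(out_of_sample_extension(X_hat, a_new))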

Representation Learning on Graphs: Methods and Applications

TLDR
A conceptual review of key advancements in representation learning on graphs, including matrix factorization-based methods, random-walk based algorithms, and graph neural networks, is provided.

Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec

TLDR
The NetMF method offers significant improvements over DeepWalk and LINE for conventional network mining tasks and provides theoretical connections between skip-gram based network embedding algorithms and the theory of the graph Laplacian.
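
A rough sketch of the matrix-factorization view mentioned above: DeepWalk with window size T approximately factorizes a shifted logarithm of a random-walk co-occurrence matrix, which can then be embedded with a truncated SVD. The constants and truncation below follow the commonly stated NetMF-style formula, but this is a simplified approximation rather than the paper's exact procedure.

import numpy as np

def netmf_embedding(A, dim=2, T=5, b=1.0):
    """DeepWalk-style embedding via explicit matrix factorization (NetMF-like).

    A: (n, n) symmetric adjacency matrix. Forms
        M = vol(G) / (b * T) * (sum_{r=1..T} P^r) D^{-1},  with  P = D^{-1} A,
    takes an elementwise log of max(M, 1), and embeds with a truncated SVD.
    """
    deg = A.sum(axis=1)
    vol = deg.sum()
    D_inv = np.diag(1.0 / deg)
    P = D_inv @ A
    S = np.zeros_like(A, dtype=float)
    Pr = np.eye(A.shape[0])
    for _ in range(T):
        Pr = Pr @ P
        S += Pr
    M = (vol / (b * T)) * S @ D_inv
    logM = np.log(np.maximum(M, 1.0))       # truncated logarithm
    U, s, _ = np.linalg.svd(logM)
    return U[:, :dim] * np.sqrt(s[:dim])

# toy usage on a 4-cycle
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(netmf_embedding(A, dim=2).shape)      # (4, 2)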

Graph Neural Networks Exponentially Lose Expressive Power for Node Classification

TLDR
The theory relates the expressive power of GCNs to the topological information of the underlying graphs inherent in the graph spectra, and provides a principled guideline for weight normalization of graph neural networks.
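
A small numerical illustration (not the paper's analysis) of the over-smoothing effect behind this claim: repeatedly applying the normalized adjacency with self-loops to node features drives all rows toward a common vector, so node representations become exponentially hard to distinguish as depth grows.

import numpy as np

# toy graph: 4-cycle with self-loops added (the "augmented" adjacency)
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float) + np.eye(4)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt            # normalized propagation matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))                    # initial node features

for layer in range(1, 51):
    X = A_hat @ X                              # one linear propagation step
    if layer in (1, 10, 50):
        spread = np.max(np.std(X, axis=0))     # how different the nodes still are
        print(f"layer {layer:2d}: max feature std across nodes = {spread:.2e}")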

LINE: Large-scale Information Network Embedding

TLDR
A novel network embedding method called LINE, which is suitable for arbitrary types of information networks (undirected, directed, and/or weighted) and optimizes a carefully designed objective function that preserves both the local and global network structures.
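
A sketch of LINE-style training by edge sampling with negative sampling, showing the first-order proximity objective; the uniform negative-sampling distribution and the hyperparameters are simplifying assumptions (LINE itself uses a degree-based noise distribution and also defines a second-order objective).

import numpy as np

rng = np.random.default_rng(0)

def line_first_order(edges, n, dim=8, epochs=200, K=5, lr=0.05):
    """Edge-sampling training of a LINE-style first-order proximity objective.

    For a sampled edge (i, j), increase sigma(u_i . u_j); for K negative nodes
    j', decrease sigma(u_i . u_j'). Negatives are drawn uniformly here for
    simplicity.
    """
    U = rng.normal(scale=0.1, size=(n, dim))
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        i, j = edges[rng.integers(len(edges))]
        g = sig(U[i] @ U[j]) - 1.0                       # positive-pair gradient
        U[i], U[j] = U[i] - lr * g * U[j], U[j] - lr * g * U[i]
        for jn in rng.integers(0, n, size=K):            # negative samples
            g = sig(U[i] @ U[jn])
            U[i], U[jn] = U[i] - lr * g * U[jn], U[jn] - lr * g * U[i]
    return U

# toy usage on a 4-cycle
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(line_first_order(edges, n=4).shape)   # (4, 8)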
...