GraphWorld: Fake Graphs Bring Real Insights for GNNs

@article{Palowitch2022GraphWorldFG,
  title={GraphWorld: Fake Graphs Bring Real Insights for GNNs},
  author={John Palowitch and Anton Tsitsulin and Brandon Mayer and Bryan Perozzi},
  journal={ArXiv},
  year={2022},
  volume={abs/2203.00112}
}
Despite advances in the field of Graph Neural Networks (GNNs), only a small number (~5) of datasets are currently used to evaluate new models. This continued reliance on a handful of datasets provides minimal insight into the performance differences between models, and is especially challenging for industrial practitioners who are likely to have datasets which are very different from academic benchmarks. In the course of our work on GNN infrastructure and open-source software at Google, we have…

Citations

ProGNNosis: A Data-driven Model to Predict GNN Computation Time Using Graph Metrics
TLDR
This paper proposes ProGNNosis, a data-driven model that predicts the training time of a given GNN model on a graph of arbitrary characteristics by inspecting input graph metrics, and shows that this prediction yields an average speedup over randomly selecting a graph representation across multiple widely used GNN models.
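A rough sense of the idea: compute cheap structural metrics for each graph and fit a regressor against measured training times. The sketch below is only an illustration, not the ProGNNosis pipeline; the chosen metrics, the regressor, and the stand-in timing data are all assumptions.

    # Illustrative sketch (not the ProGNNosis pipeline): predict GNN training
    # time from simple graph metrics with a linear regression.
    import networkx as nx
    import numpy as np
    from sklearn.linear_model import LinearRegression

    def graph_metrics(g: nx.Graph) -> list[float]:
        # Cheap structural metrics that might correlate with training cost.
        return [g.number_of_nodes(), g.number_of_edges(),
                nx.density(g), nx.average_clustering(g)]

    # Hypothetical training set: random graphs paired with stand-in epoch times.
    rng = np.random.default_rng(0)
    graphs = [nx.gnp_random_graph(200, p, seed=i)
              for i, p in enumerate(rng.uniform(0.01, 0.1, 20))]
    epoch_seconds = np.array([1e-3 * g.number_of_edges() + 0.05 for g in graphs])

    X = np.array([graph_metrics(g) for g in graphs])
    model = LinearRegression().fit(X, epoch_seconds)
    print("predicted epoch time (s):", model.predict([graph_metrics(graphs[0])])[0])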
Taxonomy of Benchmarks in Graph Representation Learning
TLDR
A principled approach to taxonomize benchmarking datasets according to a sensitivity profile that is based on how much GNN performance changes due to a collection of graph perturbations is developed.
Embedding of Bipartite Graphs via Graph Neural Networks with Application to User-Item Recommendations
TLDR
This dissertation proposes a representation methodology for bipartite networks using a graph neural network capable of synthesizing the network structure and the attributes of the vertices, and aims to represent vertices in vector space so as to maximize the distance between vertices of different groups and minimize the distance between members of the same group.
Benchmarking Graph Neural Networks
TLDR
A reproducible GNN benchmarking framework is introduced, with the facility for researchers to add new models conveniently for arbitrary datasets, and a principled investigation into the recent Weisfeiler-Lehman GNNs (WL-GNNs) compared to message passing-based graph convolutional networks (GCNs).

References

SHOWING 1-10 OF 51 REFERENCES
Open Graph Benchmark: Datasets for Machine Learning on Graphs
TLDR
The OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains, ranging from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs, indicating fruitful opportunities for future research.
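For readers unfamiliar with the library, loading one of these datasets looks roughly like the sketch below (the dataset choice is arbitrary; the field names follow the graph dictionary documented by the ogb package):

    # Minimal sketch: load an OGB node-classification dataset with the
    # framework-agnostic loader (pip install ogb). Downloads data on first use.
    from ogb.nodeproppred import NodePropPredDataset

    dataset = NodePropPredDataset(name="ogbn-arxiv")
    split_idx = dataset.get_idx_split()   # standardized train/valid/test node indices
    graph, labels = dataset[0]            # one large citation graph plus node labels

    print("nodes:", graph["num_nodes"])
    print("edges:", graph["edge_index"].shape[1])
    print("train nodes:", len(split_idx["train"]))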
Can graph neural networks count substructures?
TLDR
A local relational pooling approach with inspirations from Murphy et al. (2019) is proposed and demonstrated that it is not only effective for substructure counting but also able to achieve competitive performance on real-world tasks.
GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation
TLDR
This paper presents a new Graph Neural Network type using feature-wise linear modulation (FiLM), which outperforms baseline methods on a regression task on molecular graphs and performs competitively on other tasks.
How Powerful are Graph Neural Networks?
TLDR
This work characterizes the discriminative power of popular GNN variants, such as Graph Convolutional Networks and GraphSAGE, shows that they cannot learn to distinguish certain simple graph structures, and develops a simple architecture that is provably the most expressive among the class of GNNs.
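The architecture developed there, the Graph Isomorphism Network (GIN), replaces mean or max aggregation with a sum followed by an MLP; its node update is

    h_v^{(k)} = \mathrm{MLP}^{(k)}\!\Bigl( \bigl(1 + \epsilon^{(k)}\bigr)\, h_v^{(k-1)} + \sum_{u \in \mathcal{N}(v)} h_u^{(k-1)} \Bigr)

which matches the discriminative power of the Weisfeiler-Lehman graph isomorphism test.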
Graph Neural Networks with convolutional ARMA filters
TLDR
A novel graph convolutional layer inspired by the auto-regressive moving average (ARMA) filter is proposed that provides a more flexible frequency response, is more robust to noise, and better captures the global graph structure.
Community detection and stochastic block models: recent developments
  • E. Abbe
  • Computer Science
    J. Mach. Learn. Res.
  • 2017
TLDR
The recent developments that establish the fundamental limits for community detection in the stochastic block model are surveyed, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery.
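The stochastic block model surveyed here is also the kind of generator used for GraphWorld-style synthetic benchmarks; a minimal sampling example with networkx follows (block sizes and edge probabilities are arbitrary illustrative choices):

    # Minimal sketch: sample a two-community stochastic block model.
    import networkx as nx

    sizes = [50, 50]                  # two communities of 50 nodes each
    p_in, p_out = 0.10, 0.01          # denser within communities than between
    probs = [[p_in, p_out],
             [p_out, p_in]]

    g = nx.stochastic_block_model(sizes, probs, seed=42)
    blocks = g.graph["partition"]     # node sets per community, stored by networkx
    print(g.number_of_nodes(), g.number_of_edges(), [len(b) for b in blocks])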
The Anatomy of a Large-Scale Hypertextual Web Search Engine
Friends and neighbors on the Web
Semi-Supervised Classification with Graph Convolutional Networks
  • T. Kipf, M. Welling
  • arXiv preprint arXiv:1609.02907
  • 2016
Synthetic Graph Generation to Benchmark Graph Learning
TLDR
This work develops a fully-featured synthetic graph generator that allows deep inspection of different models, and argues that synthetic graph generation allows for thorough investigation of algorithms and provides more insight than overfitting on three citation datasets.
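To illustrate the kind of parameter sweep such a generator enables, here is a toy sketch that varies the between-community edge probability of the SBM sampler shown above and records a simple statistic per graph; a real study would train and evaluate a GNN at each point, and every number here is a placeholder.

    # Toy sketch: sweep one generator parameter and record a graph statistic.
    import networkx as nx

    sizes = [50, 50]
    for p_out in (0.005, 0.02, 0.05, 0.08):
        probs = [[0.10, p_out], [p_out, 0.10]]
        g = nx.stochastic_block_model(sizes, probs, seed=0)
        avg_degree = 2 * g.number_of_edges() / g.number_of_nodes()
        print(f"p_out={p_out:.3f}  avg_degree={avg_degree:.2f}")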
...