Recipe for a General, Powerful, Scalable Graph Transformer

@inproceedings{rampasek2022recipe,
  title={Recipe for a General, Powerful, Scalable Graph Transformer},
  author={Ladislav Ramp{\'a}{\v s}ek and Mikhail Galkin and Vijay Prakash Dwivedi and Anh Tuan Luu and Guy Wolf and D. Beaini},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}
We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph Transformers (GTs) have gained popularity in the field of graph representation learning with a variety of recent publications but they lack a common foundation about what constitutes a good positional or structural encoding, and what differentiates them. In this paper, we summarize the different types of encodings with… 
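The GPS recipe combines a local message-passing module with a global attention module in every layer. A minimal NumPy sketch of that hybrid update follows; the weight names are illustrative, and dense softmax attention stands in for the linear-attention options (e.g. Performer) that give the recipe its claimed linear complexity:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gps_layer(X, A, Wm, Wq, Wk, Wv):
    """One hybrid GPS-style block: a local message-passing update plus a
    global self-attention update, combined by summation."""
    # Local MPNN step: mean over neighbours, then a linear map.
    deg = A.sum(axis=1, keepdims=True).clip(min=1)
    local = (A @ X / deg) @ Wm
    # Global step: dense self-attention over all node pairs. This is
    # O(N^2); the recipe swaps in linear attention to reach O(N).
    att = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1]))
    global_out = att @ (X @ Wv)
    return local + global_out
```

Summation of the two branches is one simple combination choice; the point of the sketch is only that each layer sees both the neighbourhood structure and the full graph.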


Open Graph Benchmark: Datasets for Machine Learning on Graphs
The OGB datasets are large-scale, encompass multiple important graph ML tasks, and span a diverse set of domains, from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs, indicating fruitful opportunities for future research.
Benchmarking Graph Neural Networks
A reproducible GNN benchmarking framework is introduced that lets researchers conveniently add new models for arbitrary datasets, together with a principled investigation into recent Weisfeiler-Lehman GNNs (WL-GNNs) compared with message-passing-based graph convolutional networks (GCNs).
A Large-Scale Database for Graph Representation Learning
This work introduces MalNet, the largest public graph database ever constructed, representing a large-scale ontology of software function call graphs, and provides a detailed analysis of MalNet, discussing its properties and provenance.
Graph Neural Networks with Learnable Structural and Positional Representations
This work proposes to decouple structural and positional representations to make it easy for the network to learn these two essential properties, and introduces a novel generic architecture called LSPE (Learnable Structural and Positional Encodings).
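The decoupling idea can be sketched as two parallel message-passing channels, one for node features and one for positional encodings; the weight shapes and the tanh nonlinearity below are illustrative simplifications, not the paper's exact update rules:

```python
import numpy as np

def lspe_layer(h, p, A, Wh, Wp):
    """LSPE-style decoupled update: node features h and positional
    encodings p travel in separate message-passing channels; the
    feature update is conditioned on p by concatenation."""
    deg = A.sum(axis=1, keepdims=True).clip(min=1)
    hp = np.concatenate([h, p], axis=1)      # condition features on positions
    h_new = np.tanh((A @ hp / deg) @ Wh)     # feature channel
    p_new = np.tanh((A @ p / deg) @ Wp)      # positional channel, evolving on its own
    return h_new, p_new
```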
Rethinking Graph Transformers with Spectral Attention
The Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) that can take advantage of the full Laplacian spectrum to learn the position of each node in a given graph, is presented, becoming the first fully-connected architecture to perform well on graph benchmarks.
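The Laplacian eigenvectors that SAN's learned positional encoding consumes come from the standard spectral construction; a plain NumPy sketch (function name illustrative):

```python
import numpy as np

def laplacian_pe(A, k):
    """Positional encoding from the k smallest non-trivial eigenvectors
    of the symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    Note that eigenvector signs are arbitrary: a +/- flip gives an
    equally valid encoding."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.clip(deg, 1.0, None))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return evecs[:, 1:k + 1]           # drop the trivial first eigenvector
```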
A Generalization of Transformer Networks to Graphs
A graph transformer with four new properties compared to the standard model is presented, closing the gap between the original transformer, which was designed for the limited case of line graphs, and graph neural networks, which can work with arbitrary graphs.
Principal Neighbourhood Aggregation for Graph Nets
This work proposes Principal Neighbourhood Aggregation (PNA), a novel architecture combining multiple aggregators with degree-scalers (which generalize the sum aggregator) and compares the capacity of different models to capture and exploit the graph structure via a novel benchmark containing multiple tasks taken from classical graph theory.
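PNA's core mechanism, several aggregators modulated by degree scalers, can be sketched as follows; this is a simplified NumPy version, where `avg_d` is the average log-degree normaliser computed over the training graphs and the aggregator/scaler sets match the paper's defaults:

```python
import numpy as np

def pna_aggregate(X, A, avg_d):
    """PNA-style neighbourhood aggregation: four aggregators (mean, max,
    min, std) combined with three degree scalers (identity,
    amplification, attenuation); the output concatenates all twelve."""
    N, F = X.shape
    deg = A.sum(axis=1)
    rows = []
    for i in range(N):
        nbrs = X[A[i] > 0]
        if len(nbrs) == 0:            # isolated node: aggregate zeros
            nbrs = np.zeros((1, F))
        aggs = [nbrs.mean(0), nbrs.max(0), nbrs.min(0), nbrs.std(0)]
        log_d = np.log(max(deg[i], 1.0) + 1.0)
        scalers = [1.0, log_d / avg_d, avg_d / log_d]
        rows.append(np.concatenate([s * a for s in scalers for a in aggs]))
    return np.stack(rows)             # shape (N, 12 * F)
```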
Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets
This technical note describes recent updates to Graphormer, including architecture design modifications and the adaptation to 3D molecular dynamics simulation, and shows that with a global receptive field and an adaptive aggregation strategy, Graphormer is more powerful than classic message-passing-based GNNs.
Sign and Basis Invariant Networks for Spectral Graph Representation Learning
SignNet and BasisNet are introduced — new neural architectures that are invariant to all requisite symmetries and hence process collections of eigenspaces in a principled manner and can approximate any continuous function of eigenvectors with the proper invariances.
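The sign invariance at the heart of SignNet comes from applying a network f to both an eigenvector and its negation and summing; in the sketch below, `f` and `rho` are caller-supplied stand-ins for the paper's learned networks:

```python
import numpy as np

def sign_invariant_encode(eigvecs, f, rho):
    """SignNet's core construction: for each eigenvector v, compute
    f(v) + f(-v), which is unchanged if v's arbitrary sign flips, then
    combine the per-eigenvector outputs with rho."""
    per_vec = [f(v) + f(-v) for v in eigvecs.T]   # one sign-invariant term per eigenvector
    return rho(np.concatenate(per_vec))
```

Flipping the sign of any eigenvector column leaves the encoding unchanged, which is exactly the symmetry that makes raw Laplacian eigenvector encodings ill-defined.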
Structure-Aware Transformer for Graph Representation Learning
This work proposes the Structure-Aware Transformer, a class of simple and accessible graph Transformers built upon a new self-attention mechanism that systematically improves performance relative to the base GNN model, successfully combining the advantages of GNNs and Transformers.
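SAT's self-attention mechanism computes queries and keys from structure-aware node representations rather than raw features; in the sketch below a simple k-hop neighbourhood average stands in for the paper's learned subgraph-GNN extractor, and the weight names are illustrative:

```python
import numpy as np

def structure_aware_attention(X, A, Wq, Wk, Wv, hops=1):
    """SAT-style layer sketch: attention scores are derived from
    structure-aware representations H, not from the raw features X."""
    H = X.copy()
    deg = A.sum(axis=1, keepdims=True)
    for _ in range(hops):
        H = (A @ H + H) / (deg + 1.0)   # average over neighbours plus self
    scores = (H @ Wq) @ (H @ Wk).T / np.sqrt(Wq.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    att = np.exp(scores)
    att /= att.sum(axis=1, keepdims=True)
    return att @ (X @ Wv)
```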