Graph Attention Networks

Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, Yoshua Bengio
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset…
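
To make "masked self-attentional layers" concrete, here is a minimal single-head sketch in plain Python. It is not the authors' implementation. It assumes the commonly cited formulation in which each node scores its neighbours with a LeakyReLU over a learned vector applied to the concatenated projected features, then normalises those scores with a softmax restricted to the node's neighbourhood (the "mask"). The names `masked_attention_layer`, `neighbors`, `W` and `a` are hypothetical, the weights are hand-picked rather than learned, and the output nonlinearity and multi-head averaging are omitted.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU with an assumed negative slope of 0.2."""
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def masked_attention_layer(features, neighbors, W, a):
    """One attention head over a graph.

    features:  list of per-node feature vectors
    neighbors: dict mapping node index -> list of neighbour indices
    W:         projection matrix (out_dim rows, in_dim columns)
    a:         attention vector of length 2 * out_dim
    Returns (new_features, attention_weights_per_node).
    """
    z = [matvec(W, h) for h in features]  # projected features
    new_features, all_weights = [], []
    for i in range(len(features)):
        # The mask: attend only to graph neighbours plus the node itself.
        nbrs = sorted(set(neighbors.get(i, [])) | {i})
        # Unnormalised score for each (i, j) pair on the concatenation z_i || z_j.
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighbourhood only.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        # Attention-weighted sum of the neighbours' projected features.
        new_features.append(
            [sum(al * z[j][k] for al, j in zip(alpha, nbrs)) for k in range(len(z[i]))]
        )
        all_weights.append(dict(zip(nbrs, alpha)))
    return new_features, all_weights


# Toy path graph 0 - 1 - 2 with made-up weights.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 0.25, 0.25]
out, weights = masked_attention_layer(features, neighbors, W, a)
```

In this toy graph, node 0 assigns attention only to itself and node 1, because node 2 is not adjacent to it. Restricting the softmax to the neighbourhood in this way is what distinguishes the masked variant from attention over all nodes.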


Graph Representation Learning Network via Adaptive Sampling

A new architecture is proposed to address the scalability and efficiency issues of GAT and GraphSAGE; it is more efficient and can incorporate information about different edge types.

Graph-Revised Convolutional Network

A GCN-based graph revision module is introduced that predicts missing edges and revises edge weights with respect to downstream tasks via joint optimization; GRCN is shown to consistently outperform strong baseline methods by a large margin.

DAGCN: Dual Attention Graph Convolutional Networks

DAGCN automatically learns the importance of neighbors at different hops using a novel attention graph convolution layer, and then employs a second attention component, a self-attention pooling layer, to generalize the graph representation from the various aspects of a matrix graph embedding.

Improving Graph Attention Networks with Large Margin-based Constraints

This work first theoretically demonstrates the over-smoothing behavior of GATs and then develops an approach that constrains the attention weights according to the class boundary and feature aggregation pattern, which leads to significant improvements over the previous state-of-the-art graph attention methods on all datasets.

Graph Attention Networks with Positional Embeddings

This work proposes a framework, termed Graph Attentional Networks with Positional Embeddings (GAT-POS), to enhance GATs with positional embeddings which capture structural and positional information of the nodes in the graph.

Understanding Attention and Generalization in Graph Neural Networks

This work proposes an alternative recipe and trains attention in a weakly-supervised fashion that approaches the performance of supervised models and, compared to unsupervised models, improves results on several synthetic as well as real datasets.

Graphs, Entities, and Step Mixture

A new graph neural network that considers both edge-based neighborhood relationships and node-based entity features, i.e. Graph Entities with Step Mixture via random walk (GESM), which achieves state-of-the-art or comparable performances on eight benchmark graph datasets comprising transductive and inductive learning tasks.

Sparse Graph Attention Networks

  • Yang Ye, Shihao Ji
  • Computer Science
    IEEE Transactions on Knowledge and Data Engineering
  • 2021
This paper proposes Sparse Graph Attention Networks (SGATs), which learn sparse attention coefficients under an $L_0$-norm regularization. The learned sparse attentions are then used for all GNN layers, resulting in an edge-sparsified graph. SGAT is presented as the first graph learning algorithm that sparsifies graphs for the purpose of identifying important relationships between nodes and for robust training.

GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning

This work proposes a unified approach in which a fully-connected network is trained jointly with the graph neural network via parameter sharing, interpolation-based regularization, and self-predicted-targets.

Semi-Supervised and Self-Supervised Classification with Multi-View Graph Neural Networks

This paper proposes a novel insight to aggregate more useful information based on multi-view which does not require deep structures, and introduces a self-supervised technique to learn node representations by contrastive learning on different views.



Gated Graph Sequence Neural Networks

This work studies feature learning techniques for graph-structured inputs and achieves state-of-the-art performance on a problem from program verification, in which subgraphs need to be matched to abstract data structures.

Deep Convolutional Networks on Graph-Structured Data

This paper develops an extension of Spectral Networks that incorporates a graph estimation procedure and is tested on large-scale classification problems, matching or improving over Dropout Networks with far fewer parameters to estimate.

Inductive Representation Learning on Large Graphs

This work presents GraphSAGE, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data, and that outperforms strong baselines on three inductive node-classification benchmarks.

Semi-Supervised Classification with Graph Convolutional Networks

A scalable approach for semi-supervised learning on graph-structured data, based on an efficient variant of convolutional neural networks that operates directly on graphs, which outperforms related methods by a significant margin.

Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering

This work presents a formulation of CNNs in the context of spectral graph theory, which provides the necessary mathematical background and efficient numerical schemes to design fast localized convolutional filters on graphs.

Diffusion-Convolutional Neural Networks

Through the introduction of a diffusion-convolution operation, it is shown how diffusion-based representations can be learned from graph-structured data and used as an effective basis for node classification.

Spectral Networks and Locally Connected Networks on Graphs

This paper considers possible generalizations of CNNs to signals defined on more general domains without the action of a translation group, and proposes two constructions, one based upon a hierarchical clustering of the domain, and another based on the spectrum of the graph Laplacian.

Revisiting Semi-Supervised Learning with Graph Embeddings

On a large and diverse set of benchmark tasks, including text classification, distantly supervised entity extraction, and entity classification, the proposed semi-supervised learning framework shows improved performance over many of the existing models.

Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs

This paper proposes a unified framework for generalizing CNN architectures to non-Euclidean domains (graphs and manifolds) and learning local, stationary, and compositional task-specific features. The authors test the proposed method on standard tasks from image, graph and 3D shape analysis and show that it consistently outperforms previous approaches.

Attention is All you Need

A new simple network architecture, the Transformer, is proposed, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely; it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.