# Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs

    @article{Kim2021TransformersGD,
      title   = {Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs},
      author  = {Jinwoo Kim and Saeyoon Oh and Seunghoon Hong},
      journal = {ArXiv},
      year    = {2021},
      volume  = {abs/2110.14416}
    }

We present a generalization of Transformers to any-order permutation invariant data (sets, graphs, and hypergraphs). We begin by observing that Transformers generalize DeepSets, or first-order (set-input) permutation invariant MLPs. Then, based on recently characterized higher-order invariant MLPs, we extend the concept of self-attention to higher orders and propose higher-order Transformers for order-k data (k = 2 for graphs and k > 2 for hypergraphs). Unfortunately, higher-order Transformers…
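The opening observation, that self-attention with uniform attention weights collapses to a DeepSets-style equivariant layer, can be sketched in numpy. The weight matrices `W1` and `W2` are hypothetical placeholders, not the paper's parameters; this is a minimal illustration, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 4                      # set size and feature dimension
X = rng.standard_normal((n, d))  # a set of n feature vectors

# DeepSets-style equivariant layer: per-element transform plus a
# mean-pooled context term shared by every element.
W1 = rng.standard_normal((d, d))
W2 = rng.standard_normal((d, d))
deepsets_out = X @ W1 + np.ones((n, 1)) @ (X.mean(axis=0, keepdims=True) @ W2)

# Self-attention whose attention matrix is forced to be uniform (1/n in
# every entry) produces exactly the same pooled-context term, which is the
# sense in which attention generalizes the DeepSets pooling.
A = np.full((n, n), 1.0 / n)     # uniform attention weights
attn_out = X @ W1 + A @ (X @ W2)

assert np.allclose(deepsets_out, attn_out)
```

Learned attention then replaces the fixed uniform matrix `A` with an input-dependent one, which is what makes Transformers strictly more general.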


## References

Showing 1-10 of 46 references

Are Transformers universal approximators of sequence-to-sequence functions?

- Computer Science, Mathematics · ICLR 2020

It is established that Transformer models are universal approximators of continuous permutation-equivariant sequence-to-sequence functions with compact support, which is quite surprising given the amount of parameter sharing in these models.

Provably Powerful Graph Networks

- Computer Science, Mathematics · NeurIPS 2019

This paper proposes a simple model that interleaves standard multilayer perceptrons (MLPs) applied to the feature dimension with matrix multiplication, and shows that a reduced second-order network containing just a scaled identity operator, augmented with a single quadratic operation (matrix multiplication), has provable 3-WL expressive power.
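The interleaving of feature-dimension MLPs with channel-wise matrix multiplication can be sketched as one block in numpy. Shapes, weights, and the single-layer `mlp` are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 3                          # nodes and feature channels
X = rng.standard_normal((n, n, d))   # second-order (edge-level) representation

def mlp(x, W, b):
    # feature-dimension MLP applied independently at every (i, j) position
    return np.maximum(x @ W + b, 0.0)

# hypothetical weights for one block
W1, b1 = rng.standard_normal((d, d)), rng.standard_normal(d)
W2, b2 = rng.standard_normal((d, d)), rng.standard_normal(d)

A = mlp(X, W1, b1)
B = mlp(X, W2, b2)

# the quadratic operation: an n x n matrix multiplication per channel,
# which is the ingredient that lifts expressive power toward 3-WL
Y = np.einsum('ikd,kjd->ijd', A, B)
```

The key point is that everything except the `einsum` acts pointwise on positions, so the matrix multiplication is the only operation mixing information across node pairs.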

Invariant and Equivariant Graph Networks

- Computer Science, Mathematics · ICLR 2019

This paper provides a characterization of all permutation-invariant and equivariant linear layers for (hyper-)graph data, and shows that their dimensions, in the case of edge-value graph data, are 2 and 15, respectively.
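The dimension-2 invariant case is easy to check numerically: for n × n edge-value data, the two basis invariants are the diagonal sum and the off-diagonal sum. A hedged sketch (this verifies invariance on one random example, it is not the paper's proof):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))   # edge-value graph data

def invariant_features(A):
    # the 2-dimensional basis of permutation-invariant linear maps on
    # n x n data: the diagonal sum and the off-diagonal sum
    diag = np.trace(A)
    off = A.sum() - diag
    return np.array([diag, off])

# relabeling the nodes (conjugating by a permutation matrix) leaves
# both features unchanged
perm = rng.permutation(n)
P = np.eye(n)[perm]
assert np.allclose(invariant_features(A), invariant_features(P @ A @ P.T))
```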

Hyper-SAGNN: a self-attention based graph neural network for hypergraphs

- Computer Science, Mathematics · ICLR 2020

This work develops a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes that significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification.

Universal Invariant and Equivariant Graph Neural Networks

- Computer Science, Mathematics · NeurIPS 2019

The results show that a GNN defined by a single set of parameters can approximate uniformly well a function defined on graphs of varying size.

A Note on Over-Smoothing for Graph Neural Networks

- Computer Science, Mathematics · arXiv 2020

It is shown that when the weight matrix satisfies conditions determined by the spectrum of the augmented normalized Laplacian, the Dirichlet energy of the embeddings converges to zero, resulting in a loss of discriminative power.
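The decay of Dirichlet energy under repeated smoothing can be illustrated on a toy graph. This sketch drops the weight matrices entirely, so it shows only the pure-propagation case rather than the paper's weighted condition:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
# a small random undirected graph with self-loops (augmented adjacency)
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T + np.eye(n)

D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(1)))
S = D_inv_sqrt @ A @ D_inv_sqrt   # augmented normalized propagation matrix
L = np.eye(n) - S                 # augmented normalized Laplacian

def dirichlet_energy(X, L):
    # trace(X^T L X): how much the embeddings vary across edges
    return np.trace(X.T @ L @ X)

X = rng.standard_normal((n, 3))
energies = []
for _ in range(20):
    X = S @ X                     # repeated smoothing, no weight matrices
    energies.append(dirichlet_energy(X, L))

# the energy decays toward zero: embeddings lose discriminative power
assert energies[-1] < energies[0]
```

The decay follows because every eigencomponent of `S` with eigenvalue below 1 shrinks under iteration, and the eigenvalue-1 component carries zero Dirichlet energy.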

Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks

- Computer Science · ICML 2019

This work presents an attention-based neural network module, the Set Transformer, specifically designed to model interactions among elements of an input set, and reduces the computation time of self-attention from quadratic to linear in the number of elements in the set.
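The inducing-point trick behind the linear-time attention can be sketched as follows. This is a single-head, weight-free simplification of the induced set attention block, with random inducing points standing in for learned ones, so it is an assumption-laden sketch rather than the Set Transformer itself:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 100, 8, 16             # set size, inducing points, feature dim

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attend(Q, K, V):
    # plain scaled dot-product attention
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

X = rng.standard_normal((n, d))  # input set
I = rng.standard_normal((m, d))  # inducing points (random here, learned in the paper)

# Attend from the m inducing points to the n inputs, then from the inputs
# back to the m summaries: cost is O(n * m) rather than O(n^2).
H = attend(I, X, X)              # (m, d) summary of the set
Y = attend(X, H, H)              # (n, d) permutation-equivariant output
```

Because `H` is invariant to reordering of `X`, the output `Y` is permutation-equivariant, which is the property the full block preserves.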

Random walks on hypergraphs

- Physics, Computer Science · Physical Review E 2020

This work contributes to unraveling the effect of higher-order interactions on diffusive processes in higher-order networks, shedding light on mechanisms at the heart of biased information spreading in complex networked systems.
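A minimal version of the standard hyperedge-then-node random walk (not the biased variants studied in the paper) can be sketched as:

```python
import random

# a toy hypergraph: hyperedges are node subsets of varying size
hyperedges = [{0, 1, 2}, {1, 3}, {2, 3, 4}, {0, 4}]

def step(v, rng):
    # pick, uniformly, a hyperedge containing v, then a different node in it
    e = rng.choice([e for e in hyperedges if v in e])
    return rng.choice([u for u in e if u != v])

rng = random.Random(0)
visits = [0] * 5
v = 0
for _ in range(10_000):
    v = step(v, rng)
    visits[v] += 1
```

The biased walks in the paper reweight both choices (which hyperedge, which node within it) by hyperedge size or other structural quantities, which is what changes the stationary distribution relative to this uniform baseline.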

Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning

- Computer Science, Mathematics · AAAI 2018

It is shown that the graph convolution of the GCN model is actually a special form of Laplacian smoothing, which is the key reason why GCNs work, but it also brings potential concerns of over-smoothing with many convolutional layers.

Deep Learning on Graphs: A Survey

- Computer Science, Mathematics · IEEE Transactions on Knowledge and Data Engineering 2022

This survey comprehensively reviews deep learning methods on graphs, dividing existing methods into five categories based on their model architectures and training strategies: graph recurrent neural networks, graph convolutional networks, graph autoencoders, graph reinforcement learning, and graph adversarial methods.