Self-Attention with Relative Position Representations

@article{Shaw2018SelfAttentionWR,
  title={Self-Attention with Relative Position Representations},
  author={Peter Shaw and Jakob Uszkoreit and Ashish Vaswani},
  journal={ArXiv},
  year={2018},
  volume={abs/1803.02155}
}
Relying entirely on an attention mechanism, the Transformer introduced by Vaswani et al. (2017) achieves state-of-the-art results for machine translation. In contrast to recurrent and convolutional neural networks, it does not explicitly model relative or absolute position information in its structure. Instead, it requires adding representations of absolute positions to its inputs. In this work we present an alternative approach, extending the self-attention mechanism to efficiently consider…
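The abstract only sketches the idea, so the following is a minimal NumPy sketch of single-head self-attention extended with learned relative-position embeddings, in the spirit of the approach described here. The function name `relative_self_attention`, the clipping distance `max_rel_dist`, and the random toy weights are illustrative assumptions rather than the authors' implementation: the attention logits gain an extra term that compares each query with an embedding of the clipped distance j - i, and the outputs gain an attention-weighted relative-value term.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relative_self_attention(x, wq, wk, wv, rel_k, rel_v, max_rel_dist):
    """Single-head self-attention with relative position representations.

    x: (n, d) input sequence; wq, wk, wv: (d, d) projections;
    rel_k, rel_v: (2*max_rel_dist + 1, d) learned relative-position embeddings.
    """
    n, d = x.shape
    q, k, v = x @ wq, x @ wk, x @ wv

    # Relative distance j - i, clipped to [-max_rel_dist, max_rel_dist],
    # then shifted so it can index the embedding tables.
    idx = np.clip(np.arange(n)[None, :] - np.arange(n)[:, None],
                  -max_rel_dist, max_rel_dist) + max_rel_dist
    a_k, a_v = rel_k[idx], rel_v[idx]          # each (n, n, d)

    # Logits add a q_i . a^K_ij term on top of the usual q_i . k_j term.
    logits = (q @ k.T + np.einsum('id,ijd->ij', q, a_k)) / np.sqrt(d)
    attn = softmax(logits, axis=-1)

    # Outputs add an attention-weighted a^V_ij term on top of the usual values.
    return attn @ v + np.einsum('ij,ijd->id', attn, a_v)

# Toy usage with random weights.
rng = np.random.default_rng(0)
n, d, max_rel_dist = 5, 8, 2
x = rng.normal(size=(n, d))
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
rel_k = rng.normal(size=(2 * max_rel_dist + 1, d))
rel_v = rng.normal(size=(2 * max_rel_dist + 1, d))
print(relative_self_attention(x, wq, wk, wv, rel_k, rel_v, max_rel_dist).shape)  # (5, 8)
```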

Citations

Publications citing this paper.
SHOWING 1-10 OF 158 CITATIONS

SANST: A Self-Attentive Network for Next Point-of-Interest Recommendation

VIEW 6 EXCERPTS
CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

Insertion-based Decoding with Automatically Inferred Generation Order

VIEW 4 EXCERPTS
CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

Modeling Recurrence for Transformer

VIEW 5 EXCERPTS
CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

On the Relation between Position Information and Sentence Length in Neural Machine Translation

VIEW 12 EXCERPTS
CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

Time Interval Aware Self-Attention for Sequential Recommendation

VIEW 5 EXCERPTS
CITES METHODS
HIGHLY INFLUENCED

Attention Augmented Convolutional Networks

VIEW 5 EXCERPTS
CITES METHODS & BACKGROUND

Attentional Policies for Cross-Context Multi-Agent Reinforcement Learning

VIEW 5 EXCERPTS
CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

BP-Transformer: Modelling Long-Range Context via Binary Partitioning

VIEW 4 EXCERPTS
CITES BACKGROUND
HIGHLY INFLUENCED

CITATION STATISTICS

  • 42 Highly Influenced Citations

  • Averaged 48 Citations per year from 2017 through 2019

  • 244% Increase in citations per year in 2019 over 2018

References

Publications referenced by this paper.
SHOWING 1-10 OF 14 REFERENCES

Attention is All you Need

VIEW 9 EXCERPTS

Rethinking the Inception Architecture for Computer Vision

VIEW 1 EXCERPT
HIGHLY INFLUENTIAL

Graph Attention Networks

VIEW 1 EXCERPT

Layer Normalization

VIEW 1 EXCERPT

Sequence to sequence learning with neural networks

  • Ilya Sutskever, Oriol Vinyals, Quoc V. Le
  • Advances in neural information processing systems
  • 2014

End-to-end memory networks

  • Sainbayar Sukhbaatar, Jason Weston, Rob Fergus
  • Advances in neural information processing systems
  • 2015
VIEW 1 EXCERPT