• Computer Science
  • Published in NIPS 2017

Attention Is All You Need

@inproceedings{Vaswani2017AttentionIA,
  title     = {Attention Is All You Need},
  author    = {Ashish Vaswani and Noam Shazeer and Niki Parmar and Jakob Uszkoreit and Llion Jones and Aidan N. Gomez and Lukasz Kaiser and Illia Polosukhin},
  booktitle = {Advances in Neural Information Processing Systems (NIPS)},
  year      = {2017}
}

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. [...] We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
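The paper's core operation is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V. A minimal NumPy sketch of that single formula (not the full multi-head Transformer; shapes and the toy data below are illustrative assumptions):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V over a batch of queries."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                            # weighted sum of value vectors

# Toy example: 2 queries attending over 3 key/value pairs of dimension 4.
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4): one output vector per query
```

The 1/sqrt(d_k) scaling keeps the dot products from growing with the key dimension, which would otherwise push the softmax into regions with vanishing gradients.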

Citations

Publications citing this paper (showing 1-10 of 5,523 citations):

  • Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing (17 excerpts; cites methods & background; highly influenced)

  • Crowd Transformer Network (16 excerpts; cites methods & background; highly influenced)

  • Semi-Supervised Disfluency Detection (13 excerpts; cites methods; highly influenced)

  • A Joint Sentence Scoring and Selection Framework for Neural Extractive Document Summarization (10 excerpts; cites background & methods; highly influenced)

  • A Study of the Tasks and Models in Machine Reading Comprehension (4 excerpts; cites methods; highly influenced)

  • BERT-of-Theseus: Compressing BERT by Progressive Module Replacing (15 excerpts; cites background & methods; highly influenced)

  • BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning (9 excerpts; cites methods, background & results; highly influenced)

  • CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus (4 excerpts; cites background & methods; highly influenced)

  • Controlling Computation versus Quality for Neural Sequence Models (9 excerpts; cites methods & background; highly influenced)

CITATION STATISTICS

  • 1,575 highly influenced citations

  • An average of 1,709 citations per year from 2017 through 2019

  • A 255% increase in citations per year in 2019 over 2018

References

Publications referenced by this paper (showing 1-10 of 38 references):

  • Can Active Memory Replace Attention? (5 excerpts)

  • Grammar as a Foreign Language (4 excerpts)

  • Neural Machine Translation in Linear Time (5 excerpts; highly influential)

  • Neural Machine Translation by Jointly Learning to Align and Translate (5 excerpts; highly influential)

  • Factorization tricks for LSTM networks (1 excerpt)