• Publications
  • Influence
Neural Machine Translation by Jointly Learning to Align and Translate
Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a singleExpand
  • 12,453
  • 1842
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
In this paper, we propose a novel neural network model called RNN Encoder‐ Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixedlengthExpand
  • 9,137
  • 1698
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). Especially, we focus on more sophisticated units that implement a gating mechanism, such as a longExpand
  • 4,422
  • 772
On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and aExpand
  • 2,674
  • 518
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images. We describe how we can trainExpand
  • 4,980
  • 499
Theano: A Python framework for fast computation of mathematical expressions
Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the mostExpand
  • 1,791
  • 135
Describing Videos by Exploiting Temporal Structure
Recent progress in using recurrent neural networks (RNNs) for image description has motivated the exploration of their application for video description. However, while images are static, workingExpand
  • 692
  • 128
Unsupervised Neural Machine Translation
In spite of the recent success of neural machine translation (NMT) in standard benchmarks, the lack of large parallel corpora poses a major practical problem for many language pairs. There have beenExpand
  • 381
  • 74
Attention-Based Models for Speech Recognition
Recurrent sequence generators conditioned on input data through an attention mechanism have recently shown very good performance on a range of tasks including machine translation, handwritingExpand
  • 1,219
  • 73
Learning Distributed Representations of Sentences from Unlabelled Data
Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentenceExpand
  • 342
  • 72