Publications
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
tl;dr: In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNNs).
  • 9,401 citations
  • 1,754 highly influential citations
  • Open Access
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
tl;dr: In this paper, we compare different types of recurrent units in recurrent neural networks that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU).
  • 4,523 citations
  • 797 highly influential citations
  • Open Access
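The gating mechanism this paper evaluates can be illustrated with a minimal GRU cell (a NumPy sketch under standard GRU equations; the weight names `Wz`, `Uz`, etc. and the dimensions are illustrative, not taken from the paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, p):
    """One step of a gated recurrent unit (illustrative weight names)."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)             # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)             # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))  # candidate state
    return (1.0 - z) * h + z * h_cand                  # gated interpolation

# usage: random small weights, 3-dim input, 4-dim hidden state
rng = np.random.default_rng(0)
d_in, d_h = 3, 4
p = {f"W{g}": 0.1 * rng.standard_normal((d_h, d_in)) for g in "zrh"}
p.update({f"U{g}": 0.1 * rng.standard_normal((d_h, d_h)) for g in "zrh"})
h = np.zeros(d_h)
for x in rng.standard_normal((5, d_in)):  # run five timesteps
    h = gru_cell(x, h, p)
```

The update gate `z` interpolates between the previous state and the candidate, which is what lets gradients flow across many timesteps, the property the paper compares against the LSTM's separate input/forget gates.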
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
tl;dr: In this work, we model abstractive text summarization using Attentional Encoder-Decoder Recurrent Neural Networks, and show that they achieve state-of-the-art performance on two different corpora.
  • 940 citations
  • 169 highly influential citations
  • Open Access
Theano: A Python framework for fast computation of mathematical expressions
tl;dr: Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.
  • 1,811 citations
  • 135 highly influential citations
  • Open Access
Relational inductive biases, deep learning, and graph networks
tl;dr: We present a new building block for the AI toolkit with a strong relational inductive bias, the graph network, which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors.
  • 712 citations
  • 85 highly influential citations
  • Open Access
Gated Feedback Recurrent Neural Networks
tl;dr: The proposed RNN, gated-feedback RNN (GF-RNN), extends the existing approach of stacking multiple recurrent layers by allowing and controlling signals flowing from upper recurrent layers to lower layers using a global gating unit for each pair of layers.
  • 498 citations
  • 63 highly influential citations
  • Open Access
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
tl;dr: We propose a new approach to second-order optimization, the saddle-free Newton method, that can rapidly escape high-dimensional saddle points, unlike gradient descent and quasi-Newton methods.
  • 811 citations
  • 60 highly influential citations
  • Open Access
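The core idea can be shown on a toy quadratic saddle (a hedged sketch of the principle only; the paper's full method approximates this in a Krylov subspace rather than via an exact eigendecomposition): replacing the Hessian's eigenvalues with their absolute values turns the saddle's attracting direction into a repelling one.

```python
import numpy as np

def saddle_free_step(grad, hess, eps=1e-6):
    """Newton-style step -|H|^{-1} g, where |H| is rebuilt from the
    absolute eigenvalues of the (symmetric) Hessian."""
    w, V = np.linalg.eigh(hess)
    inv_abs = V @ np.diag(1.0 / (np.abs(w) + eps)) @ V.T
    return -inv_abs @ grad

# f(x, y) = x^2 - y^2 has a saddle at the origin.
x = np.array([1.0, 1.0])
grad = np.array([2 * x[0], -2 * x[1]])  # gradient of f at x
hess = np.diag([2.0, -2.0])             # Hessian of f

newton = -np.linalg.inv(hess) @ grad    # plain Newton: jumps straight to the saddle
sf = saddle_free_step(grad, hess)       # saddle-free: moves away along the y direction
```

Plain Newton lands exactly on the saddle at the origin, while the saddle-free step keeps the Newton rescaling along the positive-curvature direction but flips the sign of the negative-curvature one, so the iterate escapes.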
How to Construct Deep Recurrent Neural Networks
tl;dr: In this paper, we explore different ways to extend a recurrent neural network (RNN) to a deep RNN.
  • 603 citations
  • 57 highly influential citations
  • Open Access
Pointing the Unknown Words
tl;dr: The problem of rare and unknown words is an important issue that can potentially influence the performance of many NLP systems, including both traditional count-based and deep learning models.
  • 340 citations
  • 44 highly influential citations
  • Open Access
Policy Distillation
tl;dr: We present a novel method called policy distillation that can be used to extract the policy of a reinforcement learning agent and train a new network that performs at the expert level while being dramatically smaller and more efficient.
  • 253 citations
  • 37 highly influential citations
  • Open Access
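The distillation objective can be sketched as matching the student's action distribution to a temperature-sharpened softmax of the teacher's outputs via a KL divergence (a minimal pure-Python sketch; the function names and the temperature value are illustrative, not the paper's exact hyperparameters):

```python
import math

def softmax(scores, tau=1.0):
    """Temperature-scaled softmax over a list of action scores."""
    exps = [math.exp(s / tau) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_scores, student_scores, tau=0.5):
    """KL(teacher || student): the teacher's outputs are sharpened by a
    low temperature tau, the student's policy is taken at temperature 1."""
    p = softmax(teacher_scores, tau)  # sharpened teacher targets
    q = softmax(student_scores)       # student policy
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.0, 1.0, 0.1]
agree = distillation_loss(teacher, [2.0, 1.0, 0.1])     # student ranks actions like the teacher
disagree = distillation_loss(teacher, [0.1, 1.0, 2.0])  # student prefers the opposite action
```

Minimizing this loss pulls the student's preferred action toward the teacher's, which is how a much smaller network can recover expert-level behavior.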