Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
Qualitatively, the proposed RNN Encoder‐Decoder model learns a semantically and syntactically meaningful representation of linguistic phrases.
Neural Machine Translation by Jointly Learning to Align and Translate
It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder–decoder architecture, and it is proposed to extend it by allowing the model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
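The (soft-)search described above is the additive attention mechanism. A minimal NumPy sketch, assuming learned projection matrices with hypothetical names (`W_a`, `U_a`, `v_a`), might look like:

```python
import numpy as np

def additive_attention(dec_state, enc_states, W_a, U_a, v_a):
    """Additive (Bahdanau-style) attention: a minimal sketch.

    dec_state:  (d,)     previous decoder hidden state
    enc_states: (T, d)   encoder annotations for the T source positions
    W_a, U_a:   (k, d)   learned projections (hypothetical shapes)
    v_a:        (k,)     learned scoring vector
    Returns the context vector and the attention weights.
    """
    # One scalar score per source position:
    # e_j = v_a^T tanh(W_a s + U_a h_j)
    scores = np.tanh(W_a @ dec_state + enc_states @ U_a.T) @ v_a  # (T,)
    # Softmax over source positions gives a soft alignment
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    # The context is the expected annotation under that alignment
    context = alpha @ enc_states  # (d,)
    return context, alpha
```

The weights `alpha` form a probability distribution over source positions, so the decoder attends to different words at each target step instead of compressing the whole sentence into one fixed-length vector.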
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
Advanced recurrent units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU), are evaluated on sequence modeling tasks; the GRU is found to be comparable to the LSTM.
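The GRU's gating mechanism can be sketched as a single update step in NumPy (bias terms omitted and weight names hypothetical):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU update: a minimal sketch.

    x: (n,) input at this time step; h: (d,) previous hidden state.
    W*: (d, n) input weights; U*: (d, d) recurrent weights.
    """
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    # Linearly interpolate between the old state and the candidate
    return (1.0 - z) * h + z * h_tilde
```

Unlike the LSTM, the GRU has no separate memory cell: the update gate `z` alone decides how much of the previous state to keep, which is one reason the two units often perform comparably with the GRU using fewer parameters.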
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
- Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, Yoshua Bengio
- Computer Science, SSST@EMNLP
- 3 September 2014
It is shown that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
An attention-based model that automatically learns to describe the content of images is introduced; it can be trained deterministically using standard backpropagation techniques, or stochastically by maximizing a variational lower bound.
Describing Videos by Exploiting Temporal Structure
- L. Yao, Atousa Torabi, Aaron C. Courville
- Computer Science, IEEE International Conference on Computer Vision (ICCV)
- 27 February 2015
This work proposes an approach that successfully takes into account both the local and global temporal structure of videos to produce descriptions, including a temporal attention mechanism that goes beyond local temporal modeling and learns to automatically select the most relevant temporal segments given the text-generating RNN.
Theano: A Python framework for fast computation of mathematical expressions
The performance of Theano is compared against Torch7 and TensorFlow on several machine learning models and recently-introduced functionalities and improvements are discussed.
Unsupervised Neural Machine Translation
This work proposes a novel method to train an NMT system in a completely unsupervised manner, relying on nothing but monolingual corpora; it consists of a slightly modified attentional encoder-decoder model trained on monolingual corpora alone using a combination of denoising and backtranslation.
Recurrent Neural Networks for Multivariate Time Series with Missing Values
- Zhengping Che, S. Purushotham, Kyunghyun Cho, D. Sontag, Yan Liu
- Computer Science, Scientific Reports
- 6 June 2016
Novel deep learning models based on the gated recurrent unit (GRU), a state-of-the-art recurrent neural network, take two representations of missing patterns, i.e., masking and time interval, and effectively incorporate them into a deep model architecture, so that the model not only captures the long-term temporal dependencies in time series but also exploits the missing patterns to achieve better prediction results.
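The two representations named above can be built directly from a time series with missing entries. A sketch in NumPy, assuming regularly indexed time steps (the function name is hypothetical):

```python
import numpy as np

def missing_pattern_features(X):
    """Build masking and time-interval representations of missing data.

    X: (T, d) array with np.nan marking missing measurements.
    Returns (values, mask, delta):
      mask[t, k]  = 1 if X[t, k] was observed, else 0
      delta[t, k] = steps since variable k was last observed
      values      = X with missing entries carried forward from the
                    last observation (0 if never observed).
    """
    T, d = X.shape
    mask = (~np.isnan(X)).astype(float)
    values = np.zeros_like(X)
    delta = np.zeros((T, d))
    last = np.zeros(d)  # last observed value per variable
    for t in range(T):
        if t > 0:
            # The interval resets to one step after an observation,
            # and otherwise keeps accumulating
            delta[t] = np.where(mask[t - 1] == 1, 1.0, delta[t - 1] + 1.0)
        observed = mask[t] == 1
        last[observed] = X[t, observed]
        values[t] = last
    return values, mask, delta
```

The recurrent model then receives `values`, `mask`, and `delta` together at each step, so it can learn, for example, to trust a carried-forward value less as the interval since the last observation grows.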
Attention-Based Models for Speech Recognition
- J. Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, Yoshua Bengio
- Computer Science, NIPS
- 24 June 2015
The attention mechanism is extended with features needed for speech recognition, and a novel, generic method of adding location awareness to the attention mechanism is proposed to alleviate the issue of high phoneme error rates.