• Publications
  • Influence
Theano: A Python framework for fast computation of mathematical expressions
TLDR
Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Expand
Attention-Based Models for Speech Recognition
TLDR
We extend the attention-mechanism with features needed for speech recognition and propose a novel method of adding location-awareness to the attention mechanism to alleviate this issue. Expand
Regularizing Neural Networks by Penalizing Confident Output Distributions
TLDR
We systematically explore regularizing neural networks by penalizing low entropy output distributions, which has been shown to improve exploration in reinforcement learning, acts as a strong regularizer in supervised learning. Expand
End-to-end attention-based large vocabulary speech recognition
TLDR
We investigate an alternative method for sequence modelling based on an attention mechanism that allows a Recurrent Neural Network (RNN) to learn alignments between sequences of input frames and output labels. Expand
Robust 3D Action Recognition with Random Occupancy Patterns
TLDR
We study the problem of action recognition from depth sequences captured by depth cameras, where noise and occlusion are common problems because they are captured with a single commodity camera. Expand
State-of-the-Art Speech Recognition with Sequence-to-Sequence Models
TLDR
In this work, we explore a variety of structural and optimization improvements to our LAS model which significantly improve performance. Expand
End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results
TLDR
We replace the Hidden Markov Model (HMM) which is traditionally used in in continuous speech recognition with a bi-directional recurrent neural network encoder coupled to a recurrent neural Network decoder that directly emits a stream of phonemes. Expand
Sequence-to-Sequence Models Can Directly Translate Foreign Speech
TLDR
We present a recurrent encoder-decoder deep neural network architecture that directly translates speech in one language into text in another. Expand
Unsupervised Speech Representation Learning Using WaveNet Autoencoders
TLDR
We introduce a regularization scheme that forces the representations to focus on the phonetic content of the utterance and report performance comparable with the top entries in the ZeroSpeech 2017 unsupervised acoustic unit discovery task. Expand
Towards Better Decoding and Language Model Integration in Sequence to Sequence Models
TLDR
We analyse an attention-based seq2seq speech recognition system that directly transcribes recordings into characters. Expand
...
1
2
3
4
5
...