Publications
Long Short-Term Memory
TLDR
A novel, efficient, gradient-based method called Long Short-Term Memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
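As a sketch of the core idea (not the paper's code), one step of the original 1997 cell can be written as below: the state c is updated purely additively, so the self-recurrent "constant error carousel" passes error back unattenuated. Weight names and shapes are illustrative assumptions.

```python
# A minimal sketch of one step of the original (forget-gate-free) LSTM cell.
# W_i, W_o, W_g are illustrative weight matrices of shape (hidden, input + hidden).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm97_step(x, h, c, W_i, W_o, W_g):
    z = np.concatenate([x, h])   # current input and recurrent state
    i = sigmoid(W_i @ z)         # input gate
    o = sigmoid(W_o @ z)         # output gate
    g = np.tanh(W_g @ z)         # candidate cell input
    c = c + i * g                # constant error carousel: dc_t/dc_{t-1} = 1
    h = o * np.tanh(c)           # gated output
    return h, c
```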
Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks
TLDR
This paper presents a novel method for training RNNs to label unsegmented sequences directly, removing the need for pre-segmented training data and for post-processing of the outputs.
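For illustration, PyTorch ships an implementation of this loss as nn.CTCLoss; the minimal sketch below (with illustrative shapes, not the authors' code) shows how per-frame log-probabilities and unsegmented target sequences are combined, the loss marginalizing over all possible alignments.

```python
# Illustrative CTC training step using PyTorch's nn.CTCLoss.
import torch
import torch.nn as nn

T, N, C = 50, 4, 28  # frames, batch size, classes (index 0 reserved for blank)
log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(2)
targets = torch.randint(1, C, (N, 10), dtype=torch.long)  # unsegmented label sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)  # sums over all alignments of targets to frames
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()            # gradients w.r.t. the per-frame log-probabilities
```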
Deep learning in neural networks: An overview
TLDR
This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning, and evolutionary computation, as well as indirect search for short programs encoding deep and large networks.
Framewise phoneme classification with bidirectional LSTM and other neural network architectures
TLDR
The main findings are that bidirectional networks outperform unidirectional ones, and that Long Short-Term Memory (LSTM) is much faster and more accurate than both standard recurrent neural networks (RNNs) and time-windowed multilayer perceptrons (MLPs).
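A minimal sketch of such a framewise classifier, assuming PyTorch and illustrative layer sizes (39 input features and 61 phoneme classes, as in TIMIT-style setups), rather than the authors' implementation:

```python
# A bidirectional LSTM that emits a phoneme prediction for every frame.
import torch
import torch.nn as nn

class FramewiseBLSTM(nn.Module):
    def __init__(self, n_features=39, n_hidden=128, n_phonemes=61):
        super().__init__()
        # Forward and backward passes are concatenated, so every frame
        # sees both past and future context.
        self.blstm = nn.LSTM(n_features, n_hidden,
                             bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * n_hidden, n_phonemes)

    def forward(self, x):        # x: (batch, time, n_features)
        h, _ = self.blstm(x)     # h: (batch, time, 2 * n_hidden)
        return self.out(h)       # per-frame phoneme logits

logits = FramewiseBLSTM()(torch.randn(2, 100, 39))  # -> (2, 100, 61)
```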
LSTM: A Search Space Odyssey
TLDR
This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. It observes that the studied hyperparameters are virtually independent and derives guidelines for their efficient adjustment.
Learning to Forget: Continual Prediction with LSTM
TLDR
This work identifies a weakness of LSTM networks processing continual input streams that are not segmented a priori into subsequences with explicitly marked ends at which the network's internal state could be reset. It proposes a novel, adaptive forget gate that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.
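A sketch of the proposed extension, using the same illustrative names as the carousel sketch above: the state update becomes c = f * c + i * g, so a forget gate f near zero lets the cell reset itself.

```python
# One step of an LSTM cell with the adaptive forget gate (illustrative names).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forget_step(x, h, c, W_f, W_i, W_o, W_g):
    z = np.concatenate([x, h])
    f = sigmoid(W_f @ z)         # forget gate: f ~ 0 resets the cell state
    i = sigmoid(W_i @ z)         # input gate
    o = sigmoid(W_o @ z)         # output gate
    g = np.tanh(W_g @ z)         # candidate cell input
    c = f * c + i * g            # adaptive carousel instead of pure accumulation
    h = o * np.tanh(c)
    return h, c
```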
Multi-column deep neural networks for image classification
TLDR
On the very competitive MNIST handwriting benchmark, this method is the first to achieve near-human performance, and it improves the state of the art on a plethora of common image classification benchmarks.
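A minimal sketch of the multi-column idea, with a toy CNN standing in for the paper's much deeper columns: several independently trained networks classify the same image, and their predictions are averaged.

```python
# Averaging the softmax outputs of several CNN "columns" (toy architecture).
import torch
import torch.nn as nn

def make_column():
    return nn.Sequential(
        nn.Conv2d(1, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(), nn.Linear(16 * 12 * 12, 10),
    )

columns = [make_column() for _ in range(5)]   # e.g. five independently trained columns
x = torch.randn(8, 1, 28, 28)                 # a batch of MNIST-sized images
probs = torch.stack([c(x).softmax(1) for c in columns]).mean(0)  # averaged prediction
```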
Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction
TLDR
A novel convolutional auto-encoder (CAE) for unsupervised feature learning is presented; initializing a CNN with the filters of a trained CAE stack yields superior performance on a digit and an object recognition benchmark.
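A minimal sketch of the idea, assuming PyTorch and an illustrative one-layer CAE: train it to reconstruct its input, then reuse the learned encoder filters to initialize a CNN's first convolution.

```python
# Unsupervised pre-training of convolutional filters with a tiny auto-encoder.
import torch
import torch.nn as nn

class CAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(1, 16, 5, padding=2)           # filters to be learned
        self.dec = nn.ConvTranspose2d(16, 1, 5, padding=2)  # reconstruction path

    def forward(self, x):
        return self.dec(torch.relu(self.enc(x)))

cae = CAE()
x = torch.randn(8, 1, 28, 28)
loss = ((cae(x) - x) ** 2).mean()   # unsupervised reconstruction objective
loss.backward()

cnn_conv = nn.Conv2d(1, 16, 5, padding=2)
cnn_conv.load_state_dict(cae.enc.state_dict())  # initialize a CNN from the trained CAE
```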
A Novel Connectionist System for Unconstrained Handwriting Recognition
TLDR
This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labelling tasks where the data is hard to segment and contains long-range bidirectional interdependencies; it significantly outperforms a state-of-the-art HMM-based system.