Corpus ID: 5201925

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

@article{Chung2014EmpiricalEO,
  title={Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling},
  author={Junyoung Chung and Çaglar G{\"u}lçehre and Kyunghyun Cho and Yoshua Bengio},
  journal={ArXiv},
  year={2014},
  volume={abs/1412.3555}
}
In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). [...] Key Result: Also, we found GRU to be comparable to LSTM.
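As a concrete illustration of the comparison (a minimal sketch using PyTorch's stock units, not the authors' code; sizes are illustrative), the two gated units can be swapped into the same sequence model, with the LSTM carrying an extra cell state and roughly one extra gate's worth of parameters:

    import torch
    import torch.nn as nn

    # Illustrative sizes, not values from the paper.
    input_size, hidden_size = 64, 128

    gru = nn.GRU(input_size, hidden_size, batch_first=True)
    lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

    x = torch.randn(8, 50, input_size)        # (batch, time, features)
    gru_out, gru_h = gru(x)                   # GRU keeps a single hidden state
    lstm_out, (lstm_h, lstm_c) = lstm(x)      # LSTM also carries a cell state

    count = lambda m: sum(p.numel() for p in m.parameters())
    print("GRU params:", count(gru))          # 3 weight blocks per layer
    print("LSTM params:", count(lstm))        # 4 weight blocks per layer
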
Evaluation of Gated Recurrent Neural Networks in Music Classification Tasks
TLDR
A key result is a significant improvement of classification accuracy achieved by training the recurrent network on random short subsequences of the vector sequences in the training set.
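The subsequence trick is simple to sketch (my own illustration, not the cited paper's code; max_len is an assumed hyperparameter): sample a fresh short window from each training sequence at every epoch.

    import random

    def random_subsequence(seq, max_len=32):
        # Sample a random contiguous window; max_len is an illustrative value.
        if len(seq) <= max_len:
            return seq
        start = random.randrange(len(seq) - max_len + 1)
        return seq[start:start + max_len]

    # Toy vector sequences of different lengths; redraw the windows every epoch.
    training_sequences = [list(range(n)) for n in (10, 50, 200)]
    batch = [random_subsequence(s) for s in training_sequences]
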
Memory visualization for gated recurrent neural networks in speech recognition
Recurrent neural networks (RNNs) have shown clear superiority in sequence modeling, particularly the ones with gated units, such as long short-term memory (LSTM) and gated recurrent unit (GRU).
The Statistical Recurrent Unit
TLDR
The efficacy of SRUs as compared to LSTMs and GRUs is shown in an unbiased manner by optimizing respective architectures' hyperparameters for both synthetic and real-world tasks.
Investigating gated recurrent neural networks for acoustic modeling
TLDR
GRU usually performs better than LSTM, possibly because GRU can modulate the previous memory content through its learned reset gates, which helps model long-span dependencies in speech sequences more efficiently; LSTMP shows performance comparable to GRU.
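The reset-gate modulation mentioned here is visible in the standard GRU update (a NumPy sketch of the usual equations, not code from the cited paper): the gate r rescales the previous hidden state inside the candidate activation.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gru_step(x, h_prev, p):
        r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])   # reset gate
        z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])   # update gate
        # r * h_prev is the "modulation of the previous memory content".
        h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])
        return (1.0 - z) * h_prev + z * h_tilde

    d_in, d_h = 4, 3
    rng = np.random.default_rng(0)
    shapes = {"Wr": (d_h, d_in), "Ur": (d_h, d_h), "br": (d_h,),
              "Wz": (d_h, d_in), "Uz": (d_h, d_h), "bz": (d_h,),
              "Wh": (d_h, d_in), "Uh": (d_h, d_h), "bh": (d_h,)}
    p = {k: rng.standard_normal(s) for k, s in shapes.items()}
    h = np.zeros(d_h)
    for x in rng.standard_normal((5, d_in)):   # a toy length-5 sequence
        h = gru_step(x, h, p)
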
GRUV: Algorithmic Music Generation using Recurrent Neural Networks
We compare the performance of two different types of recurrent neural networks (RNNs) for the task of algorithmic music generation, with audio waveforms as input. In particular, we focus on RNNs that [...]
Simplified minimal gated unit variations for recurrent neural networks
  • Joel Heck, F. Salem
  • Computer Science, Mathematics
  • 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS)
  • 2017
TLDR
Three model variants of the minimal gated unit, which further simplify that design by reducing the number of parameters in the forget-gate dynamic equation, are introduced and shown to achieve accuracy similar to the MGU model while using fewer parameters and thus incurring lower training expense.
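For context, the baseline minimal gated unit uses a single forget gate for both interpolation and reset; the sketch below shows that baseline plus one illustrative way of removing a term from the gate equation. The three variants studied in the cited paper may differ in detail.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def mgu_step(x, h_prev, p, drop_input_in_gate=False):
        # Single forget gate f; drop_input_in_gate illustrates one possible
        # parameter reduction (removing the input term), not a specific
        # variant from the cited paper.
        gate = p["Uf"] @ h_prev + p["bf"]
        if not drop_input_in_gate:
            gate = gate + p["Wf"] @ x
        f = sigmoid(gate)
        h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (f * h_prev) + p["bh"])
        return (1.0 - f) * h_prev + f * h_tilde
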
Residual Recurrent Neural Networks for Learning Sequential Representations
TLDR
The results show that an RNN unit reformulated to learn residual functions with reference to the hidden state gives state-of-the-art performance, outperforms LSTM and GRU layers in terms of speed, and achieves accuracy competitive with the other methods.
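One way to read "learning residual functions with reference to the hidden state" is an identity shortcut on the recurrent state, h_t = h_{t-1} + F(x_t, h_{t-1}); the PyTorch cell below is a sketch of that reading, not the authors' exact formulation.

    import torch
    import torch.nn as nn

    class ResidualRNNCell(nn.Module):
        # Sketch: h_t = h_{t-1} + F(x_t, h_{t-1}), with F a simple tanh layer.
        def __init__(self, input_size, hidden_size):
            super().__init__()
            self.f = nn.Linear(input_size + hidden_size, hidden_size)

        def forward(self, x, h_prev):
            residual = torch.tanh(self.f(torch.cat([x, h_prev], dim=-1)))
            return h_prev + residual          # identity shortcut on the state

    cell = ResidualRNNCell(16, 32)
    h = torch.zeros(1, 32)
    for x in torch.randn(10, 1, 16):          # toy length-10 sequence
        h = cell(x, h)
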
Simplified gating in long short-term memory (LSTM) recurrent neural networks
  • Yuzhen Lu, F. Salem
  • Computer Science, Mathematics
  • 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS)
  • 2017
TLDR
Three new parameter-reduced variants, obtained by eliminating combinations of the input signal, bias, and hidden-unit signals from the individual gating signals, can achieve performance comparable to the standard LSTM model with fewer (adaptive) parameters.
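The kind of reduction described can be sketched at the level of a single gate (illustrative only; the paper's three variants remove different combinations of the input, bias, and hidden-state terms):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Standard LSTM gate: input, previous hidden state, and bias.
    def gate_full(x, h_prev, W, U, b):
        return sigmoid(W @ x + U @ h_prev + b)

    # One illustrative reduced gate, driven by the hidden state alone;
    # it needs only the recurrent matrix U, hence fewer adaptive parameters.
    def gate_hidden_only(h_prev, U):
        return sigmoid(U @ h_prev)
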
Sequence Modeling using Gated Recurrent Neural Networks
TLDR
This paper uses recurrent neural networks to capture and model human motion data, generating motions by predicting the next data point at each time step, and demonstrates that the model is able to capture long-term dependencies in the data and generate realistic motions.
Going Wider: Recurrent Neural Network With Parallel Cells
TLDR
A simple technique called parallel cells (PCs) is proposed to enhance the learning ability of Recurrent Neural Network (RNN) by running multiple small RNN cells rather than one single large cell in each layer.
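A sketch of the parallel-cells idea (cell type and sizes are my assumptions): several small cells run side by side in one layer and their states are concatenated, in place of one large cell of the same total width.

    import torch
    import torch.nn as nn

    class ParallelCellsLayer(nn.Module):
        def __init__(self, input_size, cell_size, num_cells):
            super().__init__()
            self.cells = nn.ModuleList(
                [nn.GRUCell(input_size, cell_size) for _ in range(num_cells)])

        def forward(self, x, hs):
            # Each small cell updates its own state from the shared input.
            new_hs = [cell(x, h) for cell, h in zip(self.cells, hs)]
            return torch.cat(new_hs, dim=-1), new_hs

    # Four parallel 64-unit cells instead of one 256-unit cell.
    layer = ParallelCellsLayer(input_size=32, cell_size=64, num_cells=4)
    hs = [torch.zeros(1, 64) for _ in range(4)]
    out, hs = layer(torch.randn(1, 32), hs)
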

References

Showing 1-10 of 22 references
Learning long-term dependencies with gradient descent is difficult
TLDR
This work shows why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases, and exposes a trade-off between efficient learning by gradient descent and latching on information for long periods.
Advances in optimizing recurrent networks
TLDR
Experiments reported here evaluate the use of clipping gradients, spanning longer time ranges with leaky integration, advanced momentum techniques, using more powerful output probability models, and encouraging sparser gradients to help symmetry breaking and credit assignment.
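Of the techniques listed, leaky integration is the easiest to show in isolation (NumPy sketch with an illustrative leak rate): the state is an exponential average of successive activations, which lets the unit span longer time ranges.

    import numpy as np

    def leaky_update(h_prev, h_new, alpha=0.9):
        # Keep a fraction alpha of the old state; alpha is illustrative.
        return alpha * h_prev + (1.0 - alpha) * h_new

    h = np.zeros(4)
    for _ in range(20):
        candidate = np.tanh(np.random.randn(4))   # stand-in for the RNN activation
        h = leaky_update(h, candidate)
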
Speech recognition with deep recurrent neural networks
TLDR
This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs.
Sequence to Sequence Learning with Neural Networks
TLDR
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
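The source-reversal trick amounts to one line in the data pipeline (a sketch; the tokenized pairs are illustrative): reverse each source sentence and leave the target untouched.

    def reverse_source(pairs):
        # Reverse only the source side of each (source, target) pair.
        return [(list(reversed(src)), tgt) for src, tgt in pairs]

    pairs = [(["the", "cat", "sat"], ["le", "chat", "est", "assis"])]
    print(reverse_source(pairs))
    # [(['sat', 'cat', 'the'], ['le', 'chat', 'est', 'assis'])]
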
Generating Sequences With Recurrent Neural Networks
This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time. The approach [...]
Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription
TLDR
A probabilistic model based on distribution estimators conditioned on a recurrent neural network is introduced; it is able to discover temporal dependencies in high-dimensional sequences and outperforms many traditional models of polyphonic music on a variety of realistic datasets.
Supervised Sequence Labelling with Recurrent Neural Networks
  • A. Graves
  • Computer Science
  • Studies in Computational Intelligence
  • 2008
TLDR
Introduces a new type of output layer that allows recurrent networks to be trained directly for sequence labelling tasks where the alignment between the inputs and the labels is unknown, along with an extension of the long short-term memory network architecture to multidimensional data such as images and video sequences.
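The alignment-free output layer described here is connectionist temporal classification (CTC); a usage sketch with PyTorch's stock implementation follows (shapes and sizes are illustrative).

    import torch
    import torch.nn as nn

    T, N, C = 50, 4, 20                       # time steps, batch, classes (class 0 = blank)
    log_probs = torch.randn(T, N, C).log_softmax(dim=-1)
    targets = torch.randint(1, C, (N, 10))    # label sequences without blanks
    input_lengths = torch.full((N,), T, dtype=torch.long)
    target_lengths = torch.full((N,), 10, dtype=torch.long)

    ctc = nn.CTCLoss(blank=0)
    loss = ctc(log_probs, targets, input_lengths, target_lengths)
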
On the difficulty of training recurrent neural networks
TLDR
This paper proposes a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing-gradients problem, and empirically validates the hypothesis and the proposed solutions.
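The clipping strategy is now a standard utility; below is a minimal training-step sketch (the model, optimizer, and max_norm value are illustrative stand-ins).

    import torch
    import torch.nn as nn

    model = nn.GRU(8, 16)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    x = torch.randn(20, 2, 8)                 # (time, batch, features)
    out, _ = model(x)
    loss = out.pow(2).mean()                  # toy objective
    loss.backward()
    # Rescale the gradient if its global norm exceeds 1.0.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    opt.step()
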
Learning Recurrent Neural Networks with Hessian-Free Optimization
TLDR
This work solves the long-outstanding problem of how to effectively train recurrent neural networks on complex and difficult sequence modeling problems which may contain long-term data dependencies and offers a new interpretation of the generalized Gauss-Newton matrix of Schraudolph which is used within the HF approach of Martens.
Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks
TLDR
It is shown that multilayer perceptrons (MLPs) consisting of the proposed Lp units achieve state-of-the-art results on a number of benchmark datasets, and the proposed Lp unit is also evaluated on recently proposed deep recurrent neural networks (RNNs).
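A sketch of the Lp unit with a fixed order p (in the cited paper p is learned per unit): a small pool of linear projections is collapsed through a p-norm average.

    import numpy as np

    def lp_unit(x, W, p=2.0, eps=1e-8):
        # Pool of projections collapsed by a p-norm average; p is fixed here,
        # learned per unit in the cited paper.
        z = W @ x
        return (np.mean(np.abs(z) ** p) + eps) ** (1.0 / p)

    x = np.random.randn(8)
    W = np.random.randn(4, 8)                 # pool of 4 projections
    print(lp_unit(x, W, p=3.0))
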