Simplified minimal gated unit variations for recurrent neural networks

  • Joel Heck, Fathi M. Salem
  • Published 12 January 2017
  • Computer Science, Mathematics
  • 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS)
Recurrent neural networks with various types of hidden units have been used to solve a diverse range of problems involving sequence data. Two of the most recent proposals, gated recurrent units (GRU) and minimal gated units (MGU), have shown comparable promising results on example public datasets. In this paper, we introduce three model variants of the minimal gated unit which further simplify that design by reducing the number of parameters in the forget-gate dynamic equation. These three…
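The forget-gate simplifications the abstract describes can be sketched in NumPy. The variant equations (baseline MGU plus MGU1/MGU2/MGU3) follow the paper's naming; the toy dimensions and random weights below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy dimensions for illustration only.
rng = np.random.default_rng(0)
n_in, n_h = 4, 3
x = rng.standard_normal(n_in)           # current input
h = rng.standard_normal(n_h)            # previous hidden state
W_f = rng.standard_normal((n_h, n_in))  # input weights of the forget gate
U_f = rng.standard_normal((n_h, n_h))   # recurrent weights of the forget gate
b_f = rng.standard_normal(n_h)          # forget-gate bias

# Baseline MGU forget gate: f = sigma(W_f x + U_f h + b_f)
f_mgu = sigmoid(W_f @ x + U_f @ h + b_f)
# MGU1 drops the input term:  f = sigma(U_f h + b_f)
f_mgu1 = sigmoid(U_f @ h + b_f)
# MGU2 also drops the bias:   f = sigma(U_f h)
f_mgu2 = sigmoid(U_f @ h)
# MGU3 keeps only the bias:   f = sigma(b_f)
f_mgu3 = sigmoid(b_f)

for name, f in [("MGU", f_mgu), ("MGU1", f_mgu1), ("MGU2", f_mgu2), ("MGU3", f_mgu3)]:
    print(name, f.shape)
```

Each variant strictly shrinks the forget gate's parameter count; MGU3's gate no longer depends on the input or state at all, which is the most aggressive reduction of the three.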
Radically Simplifying Gated Recurrent Architectures Without Loss of Performance
  • J. Boardman, Y. Xie
  • Computer Science
  • 2019 IEEE International Conference on Big Data (Big Data)
  • 2019
This study demonstrates that it is possible to radically simplify the MGU without significant loss of performance for some tasks and datasets, and an extraordinarily simple Forget Gate architecture performs just as well as an MGU on the given task.
Gates are not what you need in RNNs
This paper proposes a new recurrent cell called the Residual Recurrent Unit (RRU), which outperforms traditional gated cells without employing a single gate; it is built on a residual shortcut connection together with linear transformations, ReLU, and normalization.
A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures
The LSTM cell and its variants are reviewed to explore the learning capacity of the LSTM cell, and LSTM networks are divided into two broad categories: LSTM-dominated networks and integrated LSTM networks.
This paper systematically introduces variants of the LSTM RNN, referred to as SLIM LSTMs, which express aggressively reduced parameterizations to achieve computational savings and/or speedup in (training) performance, while necessarily retaining (validation accuracy) performance comparable to the standard LSTM RNN.
Performance of Three Slim Variants of The Long Short-Term Memory (LSTM) Layer
  • Daniel Kent, F. Salem
  • Computer Science
  • 2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)
  • 2019
Computational analysis of the validation accuracy of a convolutional plus recurrent neural network architecture designed to analyze sentiment, using comparatively the standard LSTM and three Slim LSTM layers, finds that some realizations of the Slim LSTM layers can potentially perform as well as the standard LSTM layer for the considered architecture targeted at sentiment analysis.
CARU: A Content-Adaptive Recurrent Unit for the Transition of Hidden State in NLP
This article introduces a novel RNN unit inspired by GRU, namely the Content-Adaptive Recurrent Unit (CARU). The design of CARU contains all the features of GRU but requires fewer training…
Toponym Resolution with Deep Neural Networks
  • Ricardo Custódio
Toponym resolution, i.e. inferring the geographic coordinates of a given string that represents a placename, is a fundamental problem in the context of several applications related…
Applying Machine Learning to the Task of Generating Search Queries
In this paper we research two modifications of recurrent neural networks, Long Short-Term Memory networks and networks with a Gated Recurrent Unit, with the addition of an attention mechanism to both…
Airborne particle pollution predictive model using Gated Recurrent Unit (GRU) deep neural networks
A forecasting model using Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) networks, which are types of deep recurrent neural networks (RNNs), is presented, showing that this type of deep network is feasible for predicting the non-linearities of this type of data.
Gated Recurrent Networks for Video Super Resolution
This work proposes a new Gated Recurrent Convolutional Neural Network for VSR that adapts some of the key components of a Gated Recurrent Unit; it outperforms current learning-based VSR models in terms of perceptual quality and temporal consistency.


Minimal gated unit for recurrent neural networks
This work proposes a gated unit for RNNs, named the minimal gated unit (MGU), which contains only one gate and is thus a minimal design among gated hidden units.
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
Advanced recurrent units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU), are evaluated empirically, and the GRU is found to be comparable to the LSTM.
An Empirical Exploration of Recurrent Network Architectures
It is found that adding a bias of 1 to the LSTM's forget gate closes the gap between the LSTM and the recently-introduced Gated Recurrent Unit (GRU) on some but not all tasks.
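The forget-gate bias trick above is easy to see numerically: at initialization, when the weight terms are near zero, the gate value is driven almost entirely by its bias. This minimal sketch assumes nothing beyond the sigmoid gate definition itself.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_h = 8
# With weights near zero at initialization, the forget gate is driven by its bias.
b_f_default = np.zeros(n_h)  # bias 0: gate starts at exactly 0.5
b_f_ones = np.ones(n_h)      # the trick: bias 1 pushes the gate toward "remember"

print(sigmoid(b_f_default)[0])  # 0.5
print(sigmoid(b_f_ones)[0])     # ~0.731
```

Starting the gate above 0.5 biases the cell toward retaining its state early in training, which is what helps the LSTM match the GRU on long-dependency tasks.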
Long Short-Term Memory
A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
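The constant error carousel mentioned above is the additive cell-state update `c = f * c + i * g`. One forward step of a standard LSTM can be sketched as follows; the concatenated-input formulation, shapes, and small random initialization are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, P):
    """One LSTM step; P holds the weight matrices and biases (illustrative shapes)."""
    z = np.concatenate([x, h])
    i = sigmoid(P["Wi"] @ z + P["bi"])  # input gate
    f = sigmoid(P["Wf"] @ z + P["bf"])  # forget gate
    o = sigmoid(P["Wo"] @ z + P["bo"])  # output gate
    g = np.tanh(P["Wg"] @ z + P["bg"])  # candidate cell update
    c = f * c + i * g                   # additive update: the constant error carousel
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(1)
n_in, n_h = 4, 3
P = {k: rng.standard_normal((n_h, n_in + n_h)) * 0.1 for k in ("Wi", "Wf", "Wo", "Wg")}
P.update({k: np.zeros(n_h) for k in ("bi", "bf", "bo", "bg")})

h = c = np.zeros(n_h)
for _ in range(5):
    h, c = lstm_step(rng.standard_normal(n_in), h, c, P)
print(h.shape)  # (3,)
```

Because the cell state is updated additively rather than through a repeated matrix multiplication, gradients along `c` avoid the vanishing that plagues plain recurrent units.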
Adam: A Method for Stochastic Optimization
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
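The adaptive moment estimates that define Adam can be written out in a few lines. This is a minimal sketch of the published update rule applied to a toy quadratic objective; the learning rate and step count are illustrative choices.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; m and v are running first/second moment estimates."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)  # bias correction for the first moment
    v_hat = v / (1 - beta2**t)  # bias correction for the second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2 as a toy objective.
theta = np.array([5.0])
m = v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
print(float(theta[0]))  # close to 0
```

The bias-correction terms matter early on: without them, `m` and `v` are initialized at zero and the first updates would be badly underscaled.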
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
It is shown that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.
Gradient-based learning applied to document recognition
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
Long short-term memory
  • Neural Computation
  • Nov. 1997
Reduced Parameterization in Gated Recurrent Neural Networks
  • Memorandum 7.11.2016
  • 2016
Keras: Theano-based deep learning library
  • 2015