On Extended Long Short-term Memory and Dependent Bidirectional Recurrent Neural Network

Authors: Yuanhang Su, Yuzhong Huang, C.-C. Jay Kuo

A review on the long short-term memory model

A comprehensive review of LSTM’s formulation and training is presented, along with relevant applications reported in the literature and code resources implementing the model for a toy example.

Intelligent sentence completion based on global context dependent recurrent neural network language model

This paper verifies that the proposed global context dependent recurrent neural network language model captures the global semantics of the target in the sentence completion task, and shows that it achieves higher completion accuracy than other language models.

An intelligent Chatbot using deep learning with Bidirectional RNN and attention model

Gated recurrent units and temporal convolutional network for multilabel classification

A new ensemble method for managing multilabel classification is proposed, which combines a set of gated recurrent units and temporal convolutional neural networks trained with variants of the Adam optimization approach, and is shown to outperform the state-of-the-art.

A Review of the Application of Deep Learning in Trajectory Data Mining

Trajectory data is briefly introduced, some applications in trajectory data mining are summarized, the advantages and disadvantages of frequently used deep learning models in trajectory data mining are analyzed, and a few practical tricks are offered to inform later research in this field.

Image to Bengali Caption Generation Using Deep CNN and Bidirectional Gated Recurrent Unit

A CNN and bidirectional GRU architecture is proposed for producing a natural language caption from an image in the Bengali language; the authors suggest the work may help Bangladeshi people overcome language barriers and increase cultural understanding.

Heterogeneous Ensemble Deep Learning Model for Enhanced Arabic Sentiment Analysis

An optimized heterogeneous stacking ensemble model is proposed that combines three different pre-trained Deep Learning (DL) models, Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), with three meta-learners, Logistic Regression (LR), Random Forest (RF), and Support Vector Machine (SVM), in order to improve performance on Arabic sentiment analysis.



Long Short-Term Memory

A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Visualizing and Understanding Recurrent Networks

This work uses character-level language models as an interpretable testbed to provide an analysis of LSTM representations, predictions and error types, and reveals the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets.

Learning to Forget: Continual Prediction with LSTM

This work identifies a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset, and proposes a novel, adaptive forget gate that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.

How to Construct Deep Recurrent Neural Networks

Two novel architectures of a deep RNN are proposed which are orthogonal to an earlier attempt of stacking multiple recurrent layers to build a deep RNN, and an alternative interpretation is provided using a novel framework based on neural operators.

LSTM: A Search Space Odyssey

This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. It observes that the studied hyperparameters are virtually independent and derives guidelines for their efficient adjustment.

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

Recurrent units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU), are evaluated on sequence modeling, and the GRU is found to be comparable to the LSTM.

Bidirectional recurrent neural networks

It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.

Sequence to Sequence Learning with Neural Networks

This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.

Learning long-term dependencies with gradient descent is difficult

This work shows why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases, and exposes a trade-off between efficient learning by gradient descent and latching on information for long periods.

Convolutional Sequence to Sequence Learning

This work introduces an architecture based entirely on convolutional neural networks, which outperforms the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT'14 English-German and WMT'14 English-French translation at an order of magnitude faster speed, both on GPU and CPU.