Learning Nonregular Languages: A Comparison of Simple Recurrent Networks and LSTM

@article{Schmidhuber2002LearningNL,
  title={Learning Nonregular Languages: A Comparison of Simple Recurrent Networks and LSTM},
  author={J{\"u}rgen Schmidhuber and Felix A. Gers and Douglas Eck},
  journal={Neural Computation},
  year={2002},
  volume={14},
  pages={2039-2041}
}
In response to Rodriguez's recent article (2001), we compare the performance of simple recurrent nets and long short-term memory recurrent nets on context-free and context-sensitive languages. 
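The language at issue in this exchange is exemplified by the context-sensitive aⁿbⁿcⁿ named in the Gers & Schmidhuber references below, learned via next-symbol prediction. As a rough illustration of that prediction task, here is a minimal Python sketch that generates framed aⁿbⁿcⁿ strings and turns them into (input, target) pairs; the symbol set, the start/end marker "S", and the range of n are illustrative assumptions, not the exact setup used in the cited papers.

```python
# Minimal sketch (not from the paper) of the a^n b^n c^n next-symbol
# prediction task. Marker symbol, alphabet, and the range of n are
# illustrative assumptions.

def generate_anbncn(n):
    """Return one a^n b^n c^n string framed by a start/end marker 'S'."""
    return "S" + "a" * n + "b" * n + "c" * n + "S"

def prediction_pairs(s):
    """Convert a string into (current symbol, next symbol) pairs,
    the form of training example used for next-symbol prediction."""
    return list(zip(s[:-1], s[1:]))

if __name__ == "__main__":
    for n in range(1, 4):
        s = generate_anbncn(n)
        print(s, prediction_pairs(s))
```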
Improving procedures for evaluation of connectionist context-free language predictors
This letter shows how seemingly minor differences in training and evaluation procedures used in recent studies of recurrent neural networks as context-free language predictors can lead to significant differences in reported performance.
Recurrent Residual Network
  • Computer Science
  • 2017
TLDR
The recurrent residual network is introduced, a combination of the residual network and the long short-term memory network, modified by adding residual links between nonadjacent layers.
CU-NLP at SemEval-2016 Task 8: AMR Parsing using LSTM-based Recurrent Neural Networks
TLDR
This parser does not rely on a syntactic pre-parse, or heavily engineered features, and uses five recurrent neural networks as the key architectural components for estimating AMR graph structure.
Incremental Learning for RNNs: How Does it Affect Performance and Hidden Unit Activation?
TLDR
The differences between incremental and non-incremental learning with respect to success rate, generalisation performance, and characteristics of hidden unit activation are highlighted.
Abstract Meaning Representation Parsing using LSTM Recurrent Neural Networks
TLDR
The proposed AMR parser does not rely on a syntactic pre-parse, or heavily engineered features, and uses five recurrent neural networks as the key architectural components for inferring AMR graphs.
Improved access to sequential motifs: a note on the architectural bias of recurrent networks
TLDR
By experimentation, it is shown that the architectural bias of recurrent neural networks, recently analyzed by Tino et al. and Hammer and Tino, offers superior access to motifs compared to the standard feedforward neural networks.
Subregular Complexity and Deep Learning
This paper argues that the judicious use of formal language theory and grammatical inference is an invaluable tool in understanding how deep neural networks can and cannot represent and learn long-term dependencies.
A Connectionist Approach to Learn Marathi Language
  • S. Kolhe, B. Pawar
  • Computer Science
    2008 First International Conference on Emerging Trends in Engineering and Technology
  • 2008
TLDR
This empirical study considers the task of classifying Marathi language sentences as grammatical or ungrammatical, models the problem as a prediction problem, and analyzes the operation of the networks by investigating rule approximation.
State-Regularized Recurrent Neural Networks
TLDR
It is shown that state regularization simplifies the extraction of finite-state automata modeling an RNN's state transition dynamics and forces RNNs to operate more like automata with external memory and less like finite state machines, which improves their interpretability and explainability.
Memory-Augmented Recurrent Neural Networks Can Learn Generalized Dyck Languages
TLDR
This work provides the first demonstration of neural networks recognizing the generalized Dyck languages, which express the core of what it means to be a language with hierarchical structure.
...
...

References

LSTM recurrent networks learn simple context-free and context-sensitive languages
TLDR
Long short-term memory (LSTM) variants are also the first RNNs to learn a simple context-sensitive language, namely $a^n b^n c^n$.
Improving procedures for evaluation of connectionist context-free language predictors
This letter shows how seemingly minor differences in training and evaluation procedures used in recent studies of recurrent neural networks as context-free language predictors can lead to significant differences in reported performance.
Long Short-Term Memory Learns Context Free and Context Sensitive Languages
TLDR
LSTM variants are also the first RNNs to learn a context-sensitive language (CSL), namely $a^n b^n c^n$.
Simple Recurrent Networks Learn Context-Free and Context-Sensitive Languages by Counting
TLDR
A range of language tasks is shown in which an SRN develops solutions that not only count but also copy and store counting information, demonstrating how SRNs may serve as an alternative psychological model of language or sequence processing.
Context-free and context-sensitive dynamics in recurrent neural networks
TLDR
The dynamics in recurrent neural networks that process context-free languages can also be employed in processing some context-sensitive languages, and this continuity of mechanism between language classes contributes to the understanding of neural networks in modelling language learning and processing.
Long Short-Term Memory
TLDR
A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Finding Structure in Time
TLDR
A proposal along these lines, first described by Jordan (1986), is developed; it involves the use of recurrent links to provide networks with a dynamic memory and suggests a method for representing lexical categories and the type/token distinction.
Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies
Untersuchungen zu dynamischen neuronalen Netzen [Studies of dynamic neural networks]
Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, Institut für Informatik, Technische Universität München. Available on-line: ni.cs.tu-berlin.de/∼hochreit
  • 1991
...
...