On the Practical Computational Power of Finite Precision RNNs for Language Recognition

@inproceedings{Weiss2018OnTP,
  title={On the Practical Computational Power of Finite Precision RNNs for Language Recognition},
  author={Gail Weiss and Yoav Goldberg and Eran Yahav},
  booktitle={ACL},
  year={2018}
}
While Recurrent Neural Networks (RNNs) are famously known to be Turing complete, this relies on infinite precision in the states and unbounded computation time. [...] Key Result: We show empirically that the LSTM does indeed learn to effectively use the counting mechanism.
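To make the counting mechanism concrete, here is a minimal hand-wired sketch, not the paper's trained models or experimental setup: a single LSTM cell whose gates are pushed into saturation so that its cell state rises by roughly one on 'a' and falls by roughly one on 'b', the kind of counter the paper reports LSTMs learning for languages such as $a^n b^n$. The one-hot encoding and the weight constant BIG are assumptions of the sketch.

```python
import numpy as np

# Hand-wired single-cell LSTM whose cell state acts as a counter:
# up by ~1 on 'a', down by ~1 on 'b'.  Illustrative sketch only; the
# weights are set by hand, not learned as in the paper's experiments.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

VOCAB = {"a": np.array([1.0, 0.0]),   # assumed one-hot encoding
         "b": np.array([0.0, 1.0])}

BIG = 20.0                   # large value that saturates gates and candidate
W_g = np.array([BIG, -BIG])  # candidate is ~+1 for 'a' and ~-1 for 'b'

def lstm_count(string):
    c = 0.0
    for ch in string:
        x = VOCAB[ch]
        f = sigmoid(BIG)       # forget gate ~1: keep the running count
        i = sigmoid(BIG)       # input gate ~1: always add the candidate
        g = np.tanh(W_g @ x)   # ~+1 for 'a', ~-1 for 'b'
        c = f * c + i * g      # cell state accumulates (#a - #b)
    return c

print(round(float(lstm_count("aaabbb")), 3))  # ~0.0 -> counts balance out
print(round(float(lstm_count("aaaab")), 3))   # ~3.0 -> three unmatched 'a's
```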
On the Computational Power of RNNs
TLDR
It is proved that finite precision RNNs with one hidden layer and ReLU activation and finite precision GRUs are exactly as computationally powerful as deterministic finite automata.
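The equivalence with deterministic finite automata can be illustrated with a small hand-built construction. The sketch below is a textbook-style encoding, not necessarily the proof technique of the cited paper: a single-layer ReLU RNN whose hidden units only ever take the values 0 or 1 simulates a two-state DFA. The example automaton (even number of 'a's), the pair-indexed hidden units, and all weight values are assumptions of the sketch.

```python
import numpy as np

# Minimal sketch: a single-layer ReLU RNN with 0/1-valued hidden states
# that simulates a DFA.  Hidden unit (q, a) fires iff the DFA is in state q
# just before reading the current symbol and that symbol is a.

# Example DFA over {a, b}: accept strings containing an even number of 'a's.
STATES = ["even", "odd"]
SIGMA = ["a", "b"]
DELTA = {("even", "a"): "odd", ("odd", "a"): "even",
         ("even", "b"): "even", ("odd", "b"): "odd"}
START, ACCEPT = "even", {"even"}

PAIRS = [(q, a) for q in STATES for a in SIGMA]
idx = {pair: i for i, pair in enumerate(PAIRS)}
H, V = len(PAIRS), len(SIGMA)

# Update rule: h_t[(r, b)] = ReLU( sum_{delta(q,a)=r} h_{t-1}[(q,a)] + [x_t = b] - 1 )
W = np.zeros((H, H))
for (q, a), i in idx.items():
    for (r, b), j in idx.items():
        if DELTA[(q, a)] == r:
            W[j, i] = 1.0
U = np.zeros((H, V))
for (r, b), j in idx.items():
    U[j, SIGMA.index(b)] = 1.0
bias = -np.ones(H)

# Seed hidden state: any pair the transition function maps to START
# (assumes START has an incoming transition, which holds here).
h0 = np.zeros(H)
h0[idx[next(p for p in PAIRS if DELTA[p] == START)]] = 1.0

def accepts(string):
    if not string:
        return START in ACCEPT
    h = h0
    for ch in string:
        x = np.eye(V)[SIGMA.index(ch)]
        h = np.maximum(0.0, W @ h + U @ x + bias)   # ReLU update, stays in {0, 1}
    # Readout: the single active pair (q, a) determines the final state delta(q, a).
    return any(h[idx[p]] > 0.5 and DELTA[p] in ACCEPT for p in PAIRS)

print(accepts("abba"))  # True  (two 'a's)
print(accepts("ab"))    # False (one 'a')
```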
On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages
TLDR
This work studies the performance of recurrent models on Dyck-n languages, a particularly important and well-studied class of CFLs, and finds that while recurrent models generalize nearly perfectly if the lengths of the training and test strings are from the same range, they perform poorly if the test strings are longer.
On Evaluating the Generalization of LSTM Models in Formal Languages
Recurrent Neural Networks (RNNs) are theoretically Turing-complete and have established themselves as a dominant model for language processing. Yet, there still remains an uncertainty regarding their [...]
TLDR
This paper empirically evaluates the inductive learning capabilities of Long Short-Term Memory networks, a popular extension of simple RNNs, to learn simple formal languages.
RNNs Can Generate Bounded Hierarchical Languages with Optimal Memory
TLDR
Dyck-$(k,m)$ is introduced, the language of well-nested brackets of $k$ types and nesting depth at most $m$, reflecting the bounded memory needs and long-distance dependencies of natural language syntax, and it is proved that an RNN with $O(m \log k)$ hidden units suffices, an exponential reduction in memory, by an explicit construction.
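The memory bound in that result can be illustrated directly. The sketch below is not the paper's RNN construction; it checks membership in Dyck-$(k,m)$ with an explicitly bounded stack, where each of the at most $m$ slots stores one of $k$ bracket types, so the entire configuration fits in $m \lceil \log_2 k \rceil$ bits, in line with the $O(m \log k)$ hidden units quoted above. The bracket alphabet and helper names are hypothetical.

```python
import math

# Illustrative Dyck-(k, m) checker with an explicitly bounded stack.
# Each stack slot holds one of k bracket types, so the whole configuration
# needs only m * ceil(log2(k)) bits of memory.

def make_brackets(k):
    # hypothetical bracket alphabet: "(0"/")0", "(1"/")1", ...
    opens = [f"({i}" for i in range(k)]
    closes = [f"){i}" for i in range(k)]
    return opens, closes

def is_dyck_k_m(tokens, k, m):
    opens, closes = make_brackets(k)
    stack = []
    for tok in tokens:
        if tok in opens:
            if len(stack) == m:          # nesting deeper than m is rejected
                return False
            stack.append(opens.index(tok))
        elif tok in closes:
            if not stack or stack.pop() != closes.index(tok):
                return False
        else:
            return False
    return not stack

k, m = 4, 3
print(is_dyck_k_m(["(0", "(1", ")1", ")0"], k, m))   # True
print(is_dyck_k_m(["(0", ")1"], k, m))               # False (type mismatch)
print("bits needed:", m * math.ceil(math.log2(k)))   # 3 * 2 = 6
```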
LSTM Networks Can Perform Dynamic Counting
TLDR
This work is the first study to introduce the shuffle languages to analyze the computational power of neural networks, and shows that a single-layer LSTM with only one hidden unit is practically sufficient for recognizing the Dyck-1 language.
On the Turing Completeness of Modern Neural Network Architectures
TLDR
This work studies the computational power of two of the most paradigmatic architectures exemplifying these mechanisms, the Transformer and the Neural GPU, and shows both models to be Turing complete exclusively based on their capacity to compute and access internal dense representations of the data.
Sequential neural networks as automata
In recent years, neural network architectures for sequence modeling have been applied with great success to a variety of NLP tasks. What neural networks provide in performance, however, they lack in [...]
State-Regularized Recurrent Neural Networks
TLDR
It is shown that state-regularization simplifies the extraction of finite state automata modeling an RNN's state transition dynamics and forces RNNs to operate more like automata with external memory and less like finite state machines, which improves their interpretability and explainability.
Sequential Neural Networks as Automata
  • William Cooper Merrill
  • Computer Science
  • Proceedings of the Workshop on Deep Learning and Formal Languages: Building Bridges
  • 2019
TLDR
This work first defines what it means for a real-time network with bounded precision to accept a language and defines a measure of network memory, which helps explain neural computation, as well as the relationship between neural networks and natural language grammar.

References

Showing 1-10 of 20 references
Recurrent Neural Networks as Weighted Language Recognizers
TLDR
It is shown that approximations and heuristic algorithms are necessary in practical applications of single-layer, ReLU-activation, rational-weight RNNs with softmax, which are commonly used in natural language processing applications.
On the Computational Power of Neural Nets
TLDR
It is proved that such nets can simulate all Turing machines, and any multi-stack Turing machine in real time; moreover, there is a net made up of 886 processors which computes a universal partial-recursive function.
On the computational power of neural nets
TLDR
It is proved that one may simulate all Turing machines by rational nets in linear time, and there is a net made up of about 1,000 processors which computes a universal partial-recursive function.
Recurrent nets that time and count
  • F. Gers, J. Schmidhuber
  • Computer Science
  • Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium
  • 2000
TLDR
Surprisingly, LSTM augmented by "peephole connections" from its internal cells to its multiplicative gates can learn the fine distinction between sequences of spikes separated by either 50 or 49 discrete time steps, without the help of any short training exemplars.
LSTM: A Search Space Odyssey
TLDR
This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling, observes that the studied hyperparameters are virtually independent, and derives guidelines for their efficient adjustment.
LSTM recurrent networks learn simple context-free and context-sensitive languages
TLDR
Long short-term memory (LSTM) variants are also the first RNNs to learn a simple context-sensitive language, namely $a^n b^n c^n$.
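As a language-level illustration of why a counting strategy suffices for this context-sensitive language (this is not the trained LSTM's internal solution), the sketch below recognizes $a^n b^n c^n$ by enforcing the a*b*c* block order and comparing the three block counts.

```python
# Illustrative recognizer for a^n b^n c^n (n >= 1): check the block order
# and that all three counts agree.  Counting, not a stack, is enough here.

def in_anbncn(s):
    phase = {"a": 0, "b": 1, "c": 2}   # phases must only move forward
    last = 0
    count_a = count_b = count_c = 0
    for ch in s:
        if ch not in phase or phase[ch] < last:
            return False
        last = phase[ch]
        if ch == "a":
            count_a += 1
        elif ch == "b":
            count_b += 1
        else:
            count_c += 1
    return count_a == count_b == count_c and count_a > 0

print(in_anbncn("aabbcc"))   # True
print(in_anbncn("aabbc"))    # False (counts differ)
print(in_anbncn("abcabc"))   # False (wrong block order)
```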
A Simple Way to Initialize Recurrent Networks of Rectified Linear Units
TLDR
This paper proposes a simpler solution that uses recurrent neural networks composed of rectified linear units, which is comparable to LSTM on four benchmarks: two toy problems involving long-range temporal structures, a large language modeling problem, and a benchmark speech recognition problem.
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
TLDR
Advanced recurrent units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU), are evaluated on sequence modeling, and the GRU is found to be comparable to the LSTM.
Analog Computation via Neural Networks
TLDR
It is noted that these networks are not likely to solve NP-hard problems in polynomial time, as the equality P = NP in the model implies the almost complete collapse of the standard polynomial hierarchy.
Recurrent Neural Networks and Finite Automata
TLDR
Finite-size networks consisting of interconnections of synchronously evolving processors are studied, and it is proved that any function for which the left and right limits exist can be applied to the neurons to yield a network that is at least as strong computationally as a finite automaton.