Rational Recurrences

@article{Peng2018RationalR,
  title={Rational Recurrences},
  author={Hao Peng and Roy Schwartz and Sam Thomson and Noah A. Smith},
  journal={ArXiv},
  year={2018},
  volume={abs/1808.09357}
}
Despite the tremendous empirical success of neural models in natural language processing, many of them lack the strong intuitions that accompany classical machine learning approaches. […] We characterize this connection formally, defining rational recurrences to be recurrent hidden state update functions that can be written as the Forward calculation of a finite set of weighted finite-state automata (WFSAs). We show that several recent neural models use rational recurrences. Our analysis provides a fresh view of these models and…
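As a concrete illustration of the definition, the sketch below (illustrative names and a toy automaton, not the authors' code) runs the Forward calculation of a single WFSA as a recurrent hidden-state update: one matrix product per input token.

```python
# Minimal sketch of the Forward calculation of a WFSA viewed as a recurrent
# hidden-state update; the toy automaton below is illustrative.
import numpy as np

def wfsa_forward(tokens, start, transition, final):
    """start: (n,) initial weights; transition: token -> (n, n) matrix; final: (n,)."""
    h = start.copy()              # "hidden state" = total weight of reaching each state
    for tok in tokens:
        h = h @ transition[tok]   # one recurrent update per input token
    return float(h @ final)       # summed weight of all accepting paths

# Toy two-state automaton over the alphabet {"a", "b"}.
start = np.array([1.0, 0.0])
final = np.array([0.0, 1.0])
transition = {
    "a": np.array([[0.2, 0.8],
                   [0.0, 1.0]]),
    "b": np.array([[1.0, 0.0],
                   [0.0, 0.5]]),
}
print(wfsa_forward(["a", "b", "b"], start, transition, final))  # 0.2
```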


Sequential Neural Networks as Automata
  • William Cooper Merrill
  • Computer Science
    Proceedings of the Workshop on Deep Learning and Formal Languages: Building Bridges
  • 2019
TLDR
This work first defines what it means for a real-time network with bounded precision to accept a language and defines a measure of network memory, which helps explain neural computation, as well as the relationship between neural networks and natural language grammar.
RNN Architecture Learning with Sparse Regularization
TLDR
This work applies group lasso to rational RNNs (Peng et al., 2018), a family of models that is closely connected to weighted finite-state automata (WFSAs); it shows that sparsifying such models makes them easier to visualize, and presents models that rely exclusively on as few as three WFSAs after pruning more than 90% of the weights. A sketch of the regularizer appears below.
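To make the regularizer concrete, here is a hedged sketch of a group lasso penalty in which each group collects all parameters of one WFSA, so entire WFSAs can be driven to zero and pruned. The grouping, strength value, and toy shapes are illustrative assumptions, not the authors' code.

```python
# Group lasso sketch: an unsquared L2 norm per WFSA-sized parameter group,
# summed across groups, which encourages all-or-nothing sparsity per WFSA.
import numpy as np

def group_lasso_penalty(wfsa_param_groups, strength=1e-3):
    """wfsa_param_groups: list of lists of arrays, one inner list per WFSA."""
    penalty = 0.0
    for group in wfsa_param_groups:
        penalty += np.sqrt(sum(float((p ** 2).sum()) for p in group))
    return strength * penalty

# Example: three tiny "WFSAs", each with a transition tensor and a bias vector.
groups = [[np.random.randn(4, 4, 3), np.random.randn(4)] for _ in range(3)]
print(group_lasso_penalty(groups))   # added to the task loss during training
```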
A Lightweight Recurrent Network for Sequence Modeling
TLDR
This paper proposes a lightweight recurrent network, or LRN, which uses input and forget gates to handle long-range dependencies as well as gradient vanishing and explosion, with all parameter-related calculations factored outside the recurrence.
Cold-start and Interpretability: Turning Regular Expressions into Trainable Recurrent Neural Networks
TLDR
FA-RNNs, a type of recurrent neural network that combines the advantages of neural networks and regular expression rules, are proposed; they significantly outperform previous neural approaches in both zero-shot and low-resource settings and remain very competitive in rich-resource settings.
Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)
TLDR
It is hypothesized that altering BERT to better align with brain recordings would enable it to also better understand language, closing the loop and allowing the interaction between NLP and cognitive neuroscience to be a true cross-pollination.
Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning
TLDR
A fundamental connection between weighted finite automata (WFAs) and second-order recurrent neural networks (2-RNNs) is unraveled, and the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors is proposed.
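The correspondence can be stated as a short computation: a linear second-order RNN updates its hidden state bilinearly in the previous state and the current input, and with one-hot symbol inputs this is exactly the Forward score of a WFA. A hedged sketch, with illustrative shapes and random weights:

```python
# Linear 2-RNN: bilinear state update via a third-order tensor A.
# With one-hot inputs this equals the WFA score alpha^T A_{x_1} ... A_{x_T} omega.
import numpy as np

n_states, n_symbols = 3, 2
A = np.random.randn(n_states, n_symbols, n_states)   # transition tensor
alpha = np.random.randn(n_states)                     # initial weights
omega = np.random.randn(n_states)                     # final weights

def linear_2rnn_score(X):
    """X: (T, n_symbols) sequence of input vectors (one-hot for a WFA)."""
    h = alpha
    for x in X:
        # Contract previous state and current input against A (no nonlinearity).
        h = np.einsum('i,ijk,j->k', h, A, x)
    return float(h @ omega)

one_hot = np.eye(n_symbols)
print(linear_2rnn_score(one_hot[[0, 1, 0]]))
```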
Neural Finite-State Transducers: Beyond Rational Relations
TLDR
Neural finite state transducers are introduced, a family of string transduction models defining joint and conditional probability distributions over pairs of strings that compete favorably against seq2seq models while offering interpretable paths that correspond to hard monotonic alignments.
Training RNNs as Fast as CNNs
TLDR
The Simple Recurrent Unit (SRU) architecture is proposed, a recurrent unit that simplifies the computation and exposes more parallelism; it is as fast as a convolutional layer and 5-10x faster than an optimized LSTM implementation. A rough sketch of the recurrence follows.
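As an illustration of where the speed comes from, the sketch below follows an SRU-style recurrence at a high level: all matrix products involve only the current input and can be batched across time, leaving a cheap elementwise loop. Details such as the exact highway scaling and dimension projections are omitted; this is not the reference implementation.

```python
# SRU-style recurrence sketch: heavy matrix products outside the loop,
# elementwise gating inside it (input and hidden dims kept equal for simplicity).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru(X, W, W_f, W_r, b_f, b_r):
    Xt = X @ W                        # candidate values, precomputed for all steps
    F = sigmoid(X @ W_f + b_f)        # forget gates
    R = sigmoid(X @ W_r + b_r)        # reset / highway gates
    c = np.zeros(W.shape[1])
    hs = []
    for t in range(len(X)):           # elementwise recurrence only
        c = F[t] * c + (1.0 - F[t]) * Xt[t]
        h = R[t] * np.tanh(c) + (1.0 - R[t]) * X[t]
        hs.append(h)
    return np.stack(hs)

d = 4
X = np.random.randn(6, d)
out = sru(X, *(np.random.randn(d, d) for _ in range(3)), np.zeros(d), np.zeros(d))
print(out.shape)   # (6, 4)
```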

References

Showing 1-10 of 72 references
Strongly-Typed Recurrent Neural Networks
TLDR
Ideas from physics and functional programming are imported into RNN design to provide guiding principles; despite being more constrained, strongly-typed architectures achieve lower training error and generalization error comparable to classical architectures.
Bridging CNNs, RNNs, and Weighted Finite-State Machines
TLDR
SoPa combines neural representation learning with weighted finite-state automata (WFSAs) to learn a soft version of traditional surface patterns; it is shown that SoPa is an extension of a one-layer CNN, and that such CNNs are equivalent to a restricted version of SoPa and, accordingly, to a restricted form of WFSA. A sketch of that restricted case appears below.
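A hedged sketch of the restricted case mentioned above: a soft pattern scored in the max-plus semiring with only strict forward transitions of fixed length reduces to a single CNN filter followed by max-pooling. All names and shapes below are illustrative, not the SoPa implementation.

```python
# One "soft pattern" with L slots, each slot scoring one consumed word by a dot
# product: summing slot scores over a window and maxing over positions is exactly
# a width-L CNN filter with max-pooling.
import numpy as np

def soft_pattern_score(E, U, b):
    """E: (T, d) word embeddings; U: (L, d), b: (L,) one transition per slot."""
    T, L = len(E), len(U)
    best = -np.inf
    for start in range(T - L + 1):
        window = E[start:start + L]
        best = max(best, float(np.sum(np.sum(window * U, axis=1) + b)))
    return best

E = np.random.randn(10, 5)          # 10 words, embedding dim 5
U, b = np.random.randn(3, 5), np.zeros(3)
print(soft_pattern_score(E, U, b))  # == one CNN filter of width 3 + max-pool
```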
Neural Architecture Search with Reinforcement Learning
TLDR
This paper uses a recurrent network to generate the model descriptions of neural networks and trains this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.
Learning Longer Memory in Recurrent Neural Networks
TLDR
This paper shows that learning longer term patterns in real data, such as in natural language, is perfectly possible using gradient descent, by using a slight structural modification of the simple recurrent neural network architecture.
Recurrent Neural Networks as Weighted Language Recognizers
TLDR
It is shown that approximations and heuristic algorithms are necessary in practical applications of single-layer, ReLU-activation, rational-weight RNNs with softmax, which are commonly used in natural language processing.
On the State of the Art of Evaluation in Neural Language Models
TLDR
This work reevaluates several popular architectures and regularisation methods with large-scale automatic black-box hyperparameter tuning and arrives at the somewhat surprising conclusion that standard LSTM architectures, when properly regularised, outperform more recent models.
A Primer on Neural Network Models for Natural Language Processing
TLDR
This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.
Intelligible Language Modeling with Input Switched Affine Networks
TLDR
A recurrent architecture composed of input-switched affine transformations (in other words, an RNN without any nonlinearity and with one set of weights per input) is proposed; it achieves near-identical performance on language modeling of Wikipedia text. The update rule is sketched below.
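The update rule is simple enough to sketch directly: one affine map per input symbol, applied with no nonlinearity, so the whole computation can be analyzed as products of matrices. The code below is an illustrative reconstruction, not the authors' implementation.

```python
# Input-switched affine recurrence: h_t = W_{x_t} h_{t-1} + b_{x_t}.
import numpy as np

n_symbols, d = 4, 8
Ws = np.random.randn(n_symbols, d, d) * 0.1   # one weight matrix per symbol
bs = np.random.randn(n_symbols, d) * 0.1      # one bias per symbol

def isan_states(symbols, h0=None):
    h = np.zeros(d) if h0 is None else h0
    states = []
    for s in symbols:                          # s indexes the input symbol
        h = Ws[s] @ h + bs[s]                  # affine update, no nonlinearity
        states.append(h)
    return np.stack(states)

print(isan_states([0, 2, 1, 3]).shape)         # (4, 8)
```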
Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks
TLDR
It is shown that a recurrent, second-order neural network using a real-time, forward training algorithm readily learns to infer small regular grammars from positive and negative string training samples, and that many of the resulting neural-net state machines are dynamically stable, that is, they correctly classify many long unseen strings.
Attention is All you Need
TLDR
A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data. A sketch of its core operation follows.
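For context, a minimal sketch of the scaled dot-product attention operation the Transformer is built from (single head, no masking or learned projections); names and shapes are illustrative.

```python
# Scaled dot-product attention: softmax over query-key similarities, then a
# weighted sum of the values.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d), K: (m, d), V: (m, dv) -> (n, dv)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over keys
    return weights @ V

Q, K, V = np.random.randn(3, 4), np.random.randn(5, 4), np.random.randn(5, 6)
print(scaled_dot_product_attention(Q, K, V).shape)     # (3, 6)
```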