# Bridging CNNs, RNNs, and Weighted Finite-State Machines

@article{Schwartz2018BridgingCR, title={Bridging CNNs, RNNs, and Weighted Finite-State Machines}, author={Roy Schwartz and Sam Thomson and Noah A. Smith}, journal={ArXiv}, year={2018}, volume={abs/1805.06061} }

Recurrent and convolutional neural networks comprise two distinct families of models that have proven to be useful for encoding natural language utterances. In this paper we present SoPa, a new model that aims to bridge these two approaches. SoPa combines neural representation learning with weighted finite-state automata (WFSAs) to learn a soft version of traditional surface patterns. We show that SoPa is an extension of a one-layer CNN, and that such CNNs are equivalent to a restricted version…
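At its core, matching a soft pattern amounts to running a Viterbi-style (max-product) pass of a small linear-chain WFSA over per-token transition scores. The following minimal NumPy sketch is illustrative only: the function name and hand-picked transition matrices stand in for the learned, embedding-based scores that SoPa would actually use.

```python
import numpy as np

# Illustrative sketch (not the paper's SoPa implementation): scoring a
# token sequence with a small weighted finite-state automaton. States
# form a linear chain, so a high-scoring path corresponds to a soft
# surface pattern; in SoPa the transition scores would be computed from
# learned word representations.
def wfsa_score(transition_scores):
    """transition_scores: one (n_states x n_states) log-score matrix per
    token. Returns the max-product path score from the start state (0)
    to the final state (n_states - 1)."""
    n = transition_scores[0].shape[0]
    alpha = np.full(n, -np.inf)
    alpha[0] = 0.0  # log domain: the start state begins with score 0
    for T in transition_scores:
        # Viterbi update: best-scoring predecessor for each state
        alpha = np.max(alpha[:, None] + T, axis=0)
    return alpha[-1]

# Toy example: a 3-state chain over 2 tokens; each token can advance
# the automaton by one state (near-zero probabilities block other moves).
T1 = np.log(np.array([[0.1, 0.9, 1e-9],
                      [1e-9, 0.1, 1e-9],
                      [1e-9, 1e-9, 1.0]]))
T2 = np.log(np.array([[0.1, 0.5, 1e-9],
                      [1e-9, 0.1, 0.8],
                      [1e-9, 1e-9, 1.0]]))
score = wfsa_score([T1, T2])  # best path: 0 -> 1 -> 2
```

Replacing the max with a sum in the update would give the standard forward algorithm, i.e. the total weight of all accepting paths rather than the best one.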

## 22 Citations


### Rational Recurrences

- Computer Science, EMNLP
- 2018

It is shown that several recent neural models use rational recurrences, and one such model is presented that performs better than two recent baselines on language modeling and text classification, demonstrating that transferring intuitions from classical models like WFSAs can be an effective approach to designing and understanding neural models.

### RNN Architecture Learning with Sparse Regularization

- Computer Science, EMNLP
- 2019

This work applies group lasso to rational RNNs (Peng et al., 2018), a family of models that is closely connected to weighted finite-state automata (WFSAs) and shows that sparsifying such models makes them easier to visualize, and presents models that rely exclusively on as few as three WFSAs after pruning more than 90% of the weights.
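The structured-sparsity mechanism behind this kind of pruning is the group lasso, whose proximal step shrinks each parameter group by its L2 norm and can zero out entire groups (e.g. all weights belonging to one WFSA) exactly. A minimal sketch of that proximal step, not the authors' training code, with toy values:

```python
import numpy as np

# Group lasso proximal step: soft-threshold each weight group by its L2
# norm. Groups whose norm falls below the regularization strength lam
# are set exactly to zero, which is what makes whole-group pruning
# possible. (Illustrative sketch; groups and lam are toy values.)
def group_lasso_prox(groups, lam):
    out = []
    for g in groups:
        norm = np.linalg.norm(g)
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append(scale * g)
    return out

groups = [np.array([0.05, -0.02]),    # small group: pruned to zeros
          np.array([1.0, 2.0, 2.0])]  # large group: shrunk but kept
sparse = group_lasso_prox(groups, lam=0.5)
```

Applied during training, this drives low-magnitude groups to exact zeros, so the surviving WFSAs can be read off directly.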

### A Formal Hierarchy of RNN Architectures

- Computer Science, ACL
- 2020

It is hypothesized that the practical learnable capacity of unsaturated RNNs obeys a similar hierarchy, and empirical results to support this conjecture are provided.

### Weighted Automata Extraction from Recurrent Neural Networks via Regression on State Spaces

- Computer Science, AAAI
- 2020

This work presents a method to extract a weighted finite automaton (WFA) from a recurrent neural network (RNN) based on the WFA learning algorithm by Balle and Mohri, which is an extension of Angluin's classic L* algorithm.

### Neural Finite-State Transducers: Beyond Rational Relations

- Computer Science, NAACL
- 2019

Neural finite-state transducers are introduced, a family of string transduction models defining joint and conditional probability distributions over pairs of strings; they compete favorably against seq2seq models while offering interpretable paths that correspond to hard monotonic alignments.

### Cold-start and Interpretability: Turning Regular Expressions into Trainable Recurrent Neural Networks

- Computer Science, EMNLP
- 2020

FA-RNNs are proposed, a type of recurrent neural network that combines the advantages of neural networks and regular expression rules; they significantly outperform previous neural approaches in both zero-shot and low-resource settings and remain very competitive in rich-resource settings.

### The Role of Interpretable Patterns in Deep Learning for Morphology

- Computer Science, ArXiv
- 2020

A modified version of the standard sequence-to-sequence model is proposed in which the encoder is a pattern matching network: each pattern scores all possible N-character-long subwords on the source side, and the highest-scoring subword's score is used both to initialize the decoder and as the input to the attention mechanism.

### The Surprising Computational Power of Nondeterministic Stack RNNs

- Computer Science, ArXiv
- 2022

This paper shows that nondeterminism and the neural controller interact to produce two more unexpected abilities of the nondeterministic stack RNN, which can recognize not only CFLs, but also many non-context-free languages.

### Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

- Computer Science, ICML
- 2022

This paper presents RetoMaton (retrieval automaton), which approximates the datastore search based on (1) saving pointers between consecutive datastore entries and (2) clustering entries into "states".

### Higher-order Derivatives of Weighted Finite-state Machines

- Computer Science, ACL
- 2021

This work examines the computation of higher-order derivatives with respect to the normalization constant for weighted finite-state machines and provides a general algorithm for evaluating derivatives of all orders, which has not been previously described in the literature.
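As a scalar warm-up to what the paper generalizes to WFSMs: for a toy log-linear model with partition function Z(t) = Σ_x exp(t·f(x)), the first derivative of log Z is the expected feature value and the second derivative is its variance. A small numerical check of both identities, using toy values not taken from the paper:

```python
import numpy as np

# Toy partition function over three "paths" with scalar feature values.
f = np.array([0.0, 1.0, 3.0])
t = 0.7

def logZ(t):
    return np.log(np.sum(np.exp(t * f)))

# Analytic derivatives: d/dt log Z = E[f], d^2/dt^2 log Z = Var[f],
# where the expectation is under the Gibbs distribution p(x) ∝ exp(t f(x)).
p = np.exp(t * f) / np.sum(np.exp(t * f))
mean = np.sum(p * f)
var = np.sum(p * f**2) - mean**2

# Central finite differences as an independent numerical check.
h = 1e-4
d1 = (logZ(t + h) - logZ(t - h)) / (2 * h)
d2 = (logZ(t + h) - 2 * logZ(t) + logZ(t - h)) / h**2
```

For weighted finite-state machines the sum over x becomes a sum over accepting paths, which is what makes efficient algorithms for these derivatives nontrivial.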

## References

Showing 1-10 of 65 references.

### Recurrent Neural Networks as Weighted Language Recognizers

- Computer Science, NAACL
- 2018

It is shown that approximations and heuristic algorithms are necessary in practical applications of single-layer, ReLU-activation, rational-weight RNNs with softmax, which are commonly used in natural language processing applications.

### Recurrent Additive Networks

- Computer Science, ArXiv
- 2017

Recurrent additive networks are introduced, a new gated RNN which is distinguished by the use of purely additive latent state updates, and it is formally shown that RAN states are weighted sums of the input vectors, and that the gates only contribute to computing the weights of these sums.
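The weighted-sum property is easy to verify numerically: unrolling the content update c_t = i_t ⊙ x_t + f_t ⊙ c_{t-1} shows that each input x_t enters the final state with weight i_t ⊙ ∏_{k>t} f_k. A small sketch with random gate values (illustrative only; in an actual RAN the gates are computed from the input):

```python
import numpy as np

# Verify that the RAN state is a weighted sum of the inputs, with
# weights determined entirely by the gates. Gate values here are random
# placeholders standing in for the learned gating functions.
rng = np.random.default_rng(0)
d, T = 4, 5
xs = rng.normal(size=(T, d))
i_gates = rng.uniform(size=(T, d))  # input gates (toy values)
f_gates = rng.uniform(size=(T, d))  # forget gates (toy values)

# Recurrent computation: c_t = i_t * x_t + f_t * c_{t-1}
c = np.zeros(d)
for t in range(T):
    c = i_gates[t] * xs[t] + f_gates[t] * c

# Direct weighted sum: the weight of x_t is i_t * prod_{k>t} f_k
# (np.prod over an empty slice returns 1, handling the last step).
c_direct = np.zeros(d)
for t in range(T):
    w = i_gates[t] * np.prod(f_gates[t + 1:], axis=0)
    c_direct += w * xs[t]

assert np.allclose(c, c_direct)
```

Because no nonlinearity intervenes between updates, the two computations agree exactly, which is the formal sense in which the gates "only contribute to computing the weights".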

### Convolutional Sequence to Sequence Learning

- Computer Science, ICML
- 2017

This work introduces an architecture based entirely on convolutional neural networks, which outperforms the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT'14 English-German and WMT'14 English-French translation at an order of magnitude faster speed, both on GPU and CPU.

### Weighting Finite-State Transductions With Neural Context

- Computer Science, NAACL
- 2016

This work proposes to keep the traditional architecture, which uses a finite-state transducer to score all possible output strings, but to augment the scoring function with the help of recurrent networks, and defines a probability distribution over aligned output strings in the form of a weighted finite-state automaton.

### Character-Aware Neural Language Models

- Computer Science, AAAI
- 2016

A simple neural language model that relies only on character-level inputs that is able to encode, from characters only, both semantic and orthographic information and suggests that on many languages, character inputs are sufficient for language modeling.

### A Primer on Neural Network Models for Natural Language Processing

- Computer Science, J. Artif. Intell. Res.
- 2016

This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.

### Deep Unordered Composition Rivals Syntactic Methods for Text Classification

- Computer Science, ACL
- 2015

This work presents a simple deep neural network that competes with and, in some cases, outperforms such models on sentiment analysis and factoid question answering tasks while taking only a fraction of the training time.

### Modeling Skip-Grams for Event Detection with Convolutional Neural Networks

- Computer Science, EMNLP
- 2016

This work proposes to improve current CNN models for ED by introducing non-consecutive convolution, leading to significant performance improvements over current state-of-the-art systems.

### Representation of Linguistic Form and Function in Recurrent Neural Networks

- Computer Science, Linguistics, CL
- 2017

A method is proposed for estimating the contribution of individual input tokens to the networks' final prediction; it shows that the visual pathway pays selective attention to lexical categories and grammatical functions that carry semantic information, and learns to treat word types differently depending on their grammatical function and their position in the sequential structure of the sentence.

### Understanding Neural Networks through Representation Erasure

- Computer Science, ArXiv
- 2016

This paper proposes a general methodology to analyze and interpret decisions from a neural model by observing the effects on the model of erasing various parts of the representation, such as input word-vector dimensions, intermediate hidden units, or input words.