Assessing the Unitary RNN as an End-to-End Compositional Model of Syntax

Jean-Philippe Bernardy and Shalom Lappin
We show that both an LSTM and a unitary-evolution recurrent neural network (URN) can achieve encouraging accuracy on two types of syntactic patterns: context-free long-distance agreement, and mildly context-sensitive cross-serial dependencies. This work extends recent experiments on deeply nested context-free long-distance dependencies, with similar results. URNs differ from LSTMs in that they avoid non-linear activation functions, and they apply matrix multiplication to word embeddings encoded…
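The unitary composition mechanism can be illustrated with a minimal sketch (the toy lexicon and random matrices below are illustrative, not the trained model from the paper): each word is assigned a unitary matrix, and a sentence is processed by multiplying these matrices together. Since a product of unitary matrices is itself unitary, the state preserves norm without any non-linear activation.

```python
import numpy as np

def random_unitary(n, rng):
    # The Q factor of a complex QR decomposition is unitary
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    return q

rng = np.random.default_rng(0)
dim = 8
# Hypothetical toy lexicon: each word gets its own unitary matrix
lexicon = {w: random_unitary(dim, rng) for w in ["the", "dog", "barks"]}

# Composing a sentence is a product of word matrices; no non-linear
# activation is applied, so no information is squashed along the way.
state = np.eye(dim, dtype=complex)
for word in ["the", "dog", "barks"]:
    state = lexicon[word] @ state

# Unitarity check: U^H U = I (up to numerical error)
assert np.allclose(state.conj().T @ state, np.eye(dim))
```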




A Neural Model for Compositional Word Embeddings and Sentence Processing

A new neural model for word embeddings is proposed that uses unitary matrices as the primary device for encoding lexical information, going some way towards offering a class of computationally powerful deep learning systems that can be fully understood and compared to human cognitive processes for natural language learning and representation.

Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies

It is concluded that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.

Evaluating the Ability of LSTMs to Learn Context-Free Grammars

It is concluded that LSTMs do not learn the relevant underlying context-free rules, suggesting the good overall performance is attained rather by an efficient way of evaluating nuisance variables.

Using Deep Neural Networks to Learn Syntactic Agreement

DNNs require large vocabularies to form substantive lexical embeddings in order to learn structural patterns, and this finding has interesting consequences for the understanding of the way in which DNNs represent syntactic information.

Colorless Green Recurrent Networks Dream Hierarchically

The authors' language-model-trained RNNs make reliable predictions about long-distance agreement, and do not lag much behind human performance, bringing support to the hypothesis that RNN's are not just shallow-pattern extractors, but they also acquire deeper grammatical competence.

Learning the Dyck Language with Attention-based Seq2Seq Models

It is revealed that attention mechanisms still cannot truly generalize over the recursion depth, although they perform much better than other models on the closing bracket tagging task, which suggests that this commonly used task is not sufficient to test a model’s understanding of CFGs.

Distributed representations, simple recurrent networks, and grammatical structure

In this paper three problems for a connectionist account of language are considered: (1) What is the nature of linguistic representations? (2) How can complex structural relationships such as …

Generating Text with Recurrent Neural Networks

The power of RNNs trained with the new Hessian-Free optimizer is demonstrated by applying them to character-level language modeling tasks, and a new RNN variant is introduced that uses multiplicative connections, allowing the current input character to determine the transition matrix from one hidden state vector to the next.
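The multiplicative connection can be sketched as follows (a minimal illustration with random, untrained weights and made-up dimensions, using the factored parametrization: the hidden-to-hidden pathway is gated elementwise by a projection of the current input, so each character effectively selects its own transition matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, hidden, factors = 4, 6, 5

# Factored multiplicative weights: W_fh/W_fx produce the
# input-dependent gate, W_hf/W_hx map back to the hidden state.
W_fh = rng.standard_normal((factors, hidden)) * 0.1
W_fx = rng.standard_normal((factors, vocab)) * 0.1
W_hf = rng.standard_normal((hidden, factors)) * 0.1
W_hx = rng.standard_normal((hidden, vocab)) * 0.1

def mrnn_step(h, x_onehot):
    # The factors depend on both the previous state and the input,
    # so the effective transition matrix changes per character.
    f = (W_fh @ h) * (W_fx @ x_onehot)
    return np.tanh(W_hf @ f + W_hx @ x_onehot)

h = np.zeros(hidden)
for char_id in [2, 0, 3]:          # a toy character sequence
    h = mrnn_step(h, np.eye(vocab)[char_id])
```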

RNNs Can Generate Bounded Hierarchical Languages with Optimal Memory

Dyck-(k, m) is introduced: the language of well-nested brackets of k types with nesting depth at most m, reflecting the bounded memory needs and long-distance dependencies of natural language syntax. It is proved, by an explicit construction, that an RNN with $O(m \log k)$ hidden units suffices, an exponential reduction in memory.
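To make the bounded-depth language concrete, here is a minimal recognizer sketch (function name and bracket alphabet are illustrative): recognizing Dyck-(k, m) needs only a stack that never grows past depth m over k bracket types, which is the finite memory the construction exploits.

```python
def is_dyck_km(s, k, m):
    """Recognize Dyck-(k, m): well-nested strings over k bracket
    types with nesting depth at most m."""
    opens = "([{<"[:k]
    closes = ")]}>"[:k]
    stack = []
    for ch in s:
        if ch in opens:
            stack.append(ch)
            if len(stack) > m:      # depth bound exceeded
                return False
        elif ch in closes:
            # a closing bracket must match the most recent open one
            if not stack or opens.index(stack.pop()) != closes.index(ch):
                return False
        else:
            return False            # symbol outside the alphabet
    return not stack                # all brackets must be closed

print(is_dyck_km("([])[]", k=2, m=2))  # True: max depth is 2
print(is_dyck_km("((()))", k=1, m=2))  # False: depth 3 exceeds m
```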

Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

This work presents a new architecture for implementing an Efficient Unitary Neural Network (EUNN), and finds that this architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture in terms of final performance and/or wall-clock training speed.