Colorless green recurrent networks dream hierarchically

@inproceedings{Gulordava2018ColorlessGR,
  title={Colorless green recurrent networks dream hierarchically},
  author={Kristina Gulordava and Piotr Bojanowski and Edouard Grave and Tal Linzen and Marco Baroni},
  booktitle={NAACL-HLT},
  year={2018}
}
Recurrent neural networks (RNNs) have achieved impressive results in a variety of linguistic processing tasks, suggesting that they can induce non-trivial properties of language. We investigate here to what extent RNNs learn to track abstract hierarchical syntactic structure. We test whether RNNs trained with a generic language modeling objective in four languages (Italian, English, Hebrew, Russian) can predict long-distance number agreement in various constructions. We include in our… 
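A minimal sketch of the paper's evaluation idea: present a language model with a minimal pair that differs only in verb number and check that it assigns higher probability to the grammatical form. The snippet below uses a pretrained GPT-2 from the Hugging Face transformers library as a stand-in for the paper's LSTM language models; the model choice and the test pair are illustrative assumptions, not the authors' setup.

# Agreement-evaluation sketch: score a grammatical vs. an ungrammatical
# verb form under a causal language model. GPT-2 stands in for the LSTM
# language models trained in the paper; the example pair is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability of the sentence under the language model."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # The returned loss is the mean cross-entropy over predicted tokens;
    # multiply by the number of predictions to recover a total log-prob.
    return -out.loss.item() * (ids.size(1) - 1)

# Long-distance agreement: the attractor "cabinet" is singular, but the
# head noun "keys" is plural, so "are" is the grammatical continuation.
grammatical = "The keys to the cabinet are on the table."
ungrammatical = "The keys to the cabinet is on the table."
print("prefers grammatical form:",
      sentence_logprob(grammatical) > sentence_logprob(ungrammatical))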
Priorless Recurrent Networks Learn Curiously
TLDR
It is shown that domain-general recurrent neural networks will also learn number agreement within unnatural sentence structures, i.e., structures that are not found in any natural language and that humans struggle to process.
Deep RNNs Encode Soft Hierarchical Syntax
TLDR
A set of experiments is presented to demonstrate that deep recurrent neural networks learn internal representations that capture soft hierarchical notions of syntax from highly varied supervision, indicating that a soft syntactic hierarchy emerges.
RNNs as psycholinguistic subjects: Syntactic state and grammatical dependency
TLDR
It is demonstrated that these models represent and maintain incremental syntactic state, but that they do not always generalize in the same way as humans.
Can RNNs learn Recursive Nested Subject-Verb Agreements?
TLDR
A new framework for studying recursive processing in RNNs is presented, using subject-verb agreement as a probe into the network's representations; the results indicate that neural networks may extract bounded nested tree structures without learning a systematic recursive rule.
How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?
TLDR
A new architecture is proposed, the Decay RNN, which incorporates the decaying nature of neuronal activations and models the excitatory and inhibitory connections in a population of neurons and shows competitive performance relative to LSTMs on subject-verb agreement, sentence grammaticality, and language modeling tasks.
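The summary only names the idea of decaying activations; the cell below is a hedged guess at what such a recurrence could look like, with a learnable per-unit decay coefficient interpolating between the previous hidden state and a new candidate activation. It illustrates the general idea only and is not claimed to reproduce the published Decay RNN equations.

# Hedged sketch of a decay-style recurrent cell: a learnable decay
# coefficient alpha (one per hidden unit) blends the previous hidden
# state with a new candidate activation. Illustrative only; not the
# cited paper's exact architecture.
import torch
import torch.nn as nn

class DecayCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.in2hid = nn.Linear(input_size, hidden_size)
        self.hid2hid = nn.Linear(hidden_size, hidden_size, bias=False)
        self.decay_logit = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x_t, h_prev):
        alpha = torch.sigmoid(self.decay_logit)              # decay in (0, 1)
        candidate = torch.relu(self.in2hid(x_t) + self.hid2hid(h_prev))
        return alpha * h_prev + (1.0 - alpha) * candidate

# Toy usage: one recurrent step over a batch of two 8-dimensional inputs.
cell = DecayCell(input_size=8, hidden_size=16)
h = cell(torch.randn(2, 8), torch.zeros(2, 16))
print(h.shape)  # torch.Size([2, 16])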
What Syntactic Structures block Dependencies in RNN Language Models?
TLDR
This paper demonstrates that two state-of-the-art RNN models are able to maintain the filler-gap dependency through unbounded sentential embeddings and are also sensitive to the hierarchical relationship between the filler and the gap, known as syntactic islands.
Can LSTM Learn to Capture Agreement? The Case of Basque
TLDR
It is found that sequential models perform worse on agreement prediction in Basque than one might expect on the basis of previous agreement prediction work in English.
Mechanisms for handling nested dependencies in neural-network language models and humans
TLDR
This study investigated whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing, namely the storing of grammatical number and gender information in working memory and its use in long-distance agreement.
Hierarchy or Heuristic? Examining hierarchical structure and the poverty of the stimulus in recurrent neural networks
The impressive successes of recurrent neural networks (RNNs) in natural language tasks have led to a new field of research examining the ways in which RNN models exhibit aspects of the human grammar…
Do RNNs learn human-like abstract word order preferences?
TLDR
The results show that RNNs learn the abstract features of weight, animacy, and definiteness which underlie soft constraints on syntactic alternations.

References

Showing 1-10 of 47 references
Using Deep Neural Networks to Learn Syntactic Agreement
TLDR
DNNs require large vocabularies to form substantive lexical embeddings in order to learn structural patterns, and this finding has interesting consequences for the understanding of the way in which DNNs represent syntactic information.
Exploring the Syntactic Abilities of RNNs with Multi-task Learning
TLDR
It is shown that easily available agreement training data can improve performance on other syntactic tasks, in particular when only a limited amount of training data is available for those tasks, and the multi-task paradigm can be leveraged to inject grammatical knowledge into language models.
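The multi-task setup described here amounts to sharing an encoder between a language-modeling objective and an auxiliary agreement objective and summing the two losses. A minimal sketch under that reading (the sizes, the loss weight, and the single-layer LSTM encoder are illustrative assumptions, not the paper's configuration):

# Multi-task sketch: a shared LSTM encoder feeds a next-word head and a
# verb-number (agreement) head; their losses are combined with a weight.
import torch
import torch.nn as nn

class MultiTaskLM(nn.Module):
    def __init__(self, vocab_size=1000, emb=64, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.LSTM(emb, hid, batch_first=True)
        self.lm_head = nn.Linear(hid, vocab_size)   # next-word prediction
        self.agr_head = nn.Linear(hid, 2)           # singular vs. plural

    def forward(self, tokens):
        states, _ = self.encoder(self.embed(tokens))
        return self.lm_head(states), self.agr_head(states[:, -1])

model = MultiTaskLM()
tokens = torch.randint(0, 1000, (4, 12))            # toy input batch
next_tokens = torch.randint(0, 1000, (4, 12))       # toy LM targets
number_labels = torch.randint(0, 2, (4,))           # toy agreement targets

lm_logits, agr_logits = model(tokens)
ce = nn.CrossEntropyLoss()
loss = ce(lm_logits.reshape(-1, 1000), next_tokens.reshape(-1)) \
       + 0.5 * ce(agr_logits, number_labels)        # weighted auxiliary loss
loss.backward()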
Memory Architectures in Recurrent Neural Network Language Models
TLDR
The results demonstrate the value of stack-structured memory for explaining the distribution of words in natural language, in line with linguistic theories claiming a context-free backbone for natural language.
Simple Recurrent Networks and Natural Language: How Important is Starting Small?
TLDR
Evidence is reported that starting with simplified inputs is not necessary in training recurrent networks to learn pseudo-natural languages and it is suggested that the structure of natural language can be learned without special teaching methods or limited cognitive resources.
Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
TLDR
It is concluded that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.
Distributed representations, simple recurrent networks, and grammatical structure
Abstract
In this paper three problems for a connectionist account of language are considered: (1) What is the nature of linguistic representations? (2) How can complex structural relationships such as …
Toward a connectionist model of recursion in human linguistic performance
TLDR
This work suggests a novel explanation of people’s limited recursive performance, without assuming the existence of a mentally represented competence grammar allowing unbounded recursion.
Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks
TLDR
This work proposes a framework that facilitates better understanding of the encoded representations of sentence vectors and demonstrates the potential contribution of the approach by analyzing different sentence representation mechanisms.
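The framework in question trains small auxiliary classifiers on frozen sentence vectors to test which properties are decodable from them. A self-contained sketch of that recipe with synthetic embeddings and a logistic-regression probe (the data and the probed binary property are placeholders, not the paper's tasks):

# Probing sketch: fit a simple classifier on frozen sentence vectors and
# read its held-out accuracy as evidence that the probed property is
# (linearly) encoded. Embeddings and labels below are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, dim = 2000, 128
labels = rng.integers(0, 2, size=n)                  # the probed property
embeddings = rng.normal(size=(n, dim))
embeddings[:, :5] += labels[:, None] * 0.8           # plant a weak signal

X_tr, X_te, y_tr, y_te = train_test_split(embeddings, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))    # well above chance here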
What do Neural Machine Translation Models Learn about Morphology?
TLDR
This work analyzes the representations learned by neural MT models at various levels of granularity and empirically evaluates the quality of the representations for learning morphology through extrinsic part-of-speech and morphological tagging tasks.
Linguistic Regularities in Continuous Space Word Representations
TLDR
The vector-space word representations that are implicitly learned by the input-layer weights are found to be surprisingly good at capturing syntactic and semantic regularities in language, and that each relationship is characterized by a relation-specific vector offset.
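The "relation-specific vector offset" finding is usually demonstrated with analogies of the form a : b :: c : ?, answered by the nearest neighbour of vec(b) - vec(a) + vec(c). A toy sketch of that computation follows; random vectors stand in for trained embeddings, so the printed answer is only meaningful with real word vectors.

# Vector-offset analogy sketch: answer "a is to b as c is to ?" with the
# cosine-nearest neighbour of vec(b) - vec(a) + vec(c). Random vectors
# stand in for trained word embeddings here.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["king", "queen", "man", "woman", "paris", "france"]
vectors = {w: rng.normal(size=50) for w in vocab}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def analogy(a, b, c):
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vocab if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# With trained embeddings this classically returns "queen".
print(analogy("man", "king", "woman"))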