Neural language models as psycholinguistic subjects: Representations of syntactic state

@inproceedings{Futrell2019NeuralLM,
  title={Neural language models as psycholinguistic subjects: Representations of syntactic state},
  author={Richard Futrell and Ethan Gotlieb Wilcox and Takashi Morita and Peng Qian and Miguel Ballesteros and Roger Levy},
  booktitle={NAACL},
  year={2019}
}
We investigate the extent to which the behavior of neural network language models reflects incremental representations of syntactic state. To do so, we employ experimental methodologies which were originally developed in the field of psycholinguistics to study syntactic representation in the human mind. We examine neural network model behavior on sets of artificial sentences containing a variety of syntactically complex structures. These sentences not only test whether the networks have a…
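
The paradigm described in the abstract compares a language model's word-by-word predictions on controlled sentence sets, typically by measuring surprisal (negative log probability) at critical words. Below is a minimal illustrative sketch of that kind of measurement, assuming the HuggingFace transformers library and GPT-2 as a stand-in for the LSTM and RNNG language models the paper actually evaluates; the garden-path sentences are hypothetical examples, not the paper's stimuli.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def token_surprisals(sentence):
    # Tokenize and get next-token logits for every position.
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        logits = model(ids).logits              # shape: [1, seq_len, vocab]
    # Position t-1 predicts token t, so align logits[:-1] with ids[1:].
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    nats = -log_probs[torch.arange(targets.size(0)), targets]
    bits = nats / torch.log(torch.tensor(2.0))  # convert nats to bits
    tokens = tokenizer.convert_ids_to_tokens(targets.tolist())
    return list(zip(tokens, bits.tolist()))

# Compare surprisal at the disambiguating region of a classic garden-path
# sentence against an unambiguous control (illustrative stimuli only).
for s in ["The horse raced past the barn fell.",
          "The horse that was raced past the barn fell."]:
    print(s)
    print(token_surprisals(s)[-2:])

A human-like incremental parser should show a surprisal spike at the disambiguating word in the reduced (garden-path) variant relative to the unreduced control; the paper asks whether neural language models show the same signature across a range of syntactic-state constructions.
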
A Targeted Assessment of Incremental Processing in Neural Language Models and Humans
TLDR
It is shown that models systematically under-predict the difference in magnitude of incremental processing difficulty between grammatical and ungrammatical sentences, which calls into question whether contemporary language models are approaching human-like sensitivity to syntactic violations.
A Systematic Assessment of Syntactic Generalization in Neural Language Models
TLDR
A systematic evaluation of the syntactic knowledge of neural language models, testing 20 combinations of model types and data sizes on a set of 34 English-language syntactic test suites finds substantial differences in syntactic generalization performance by model architecture.
Using Priming to Uncover the Organization of Syntactic Representations in Neural Language Models
TLDR
This work uses a gradient similarity metric to demonstrate that LSTM LMs' representations of different types of sentences with relative clauses are organized hierarchically in a linguistically interpretable manner, suggesting that the LMs track abstract properties of the sentence.
Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models
TLDR
Results from a series of probes designed to test the sensitivity of vector-based language representations from pretrained language models suggest that Transformers build sensitivity to larger parts of the sentence along their layers, and that hierarchical phrase structure plays a role in this process.
Syntactic Persistence in Language Models: Priming as a Window into Abstract Language Representations
TLDR
It is concluded that the syntactic priming paradigm is a highly useful, additional tool for gaining insights into the capacities of language models.
Exploring Processing of Nested Dependencies in Neural-Network Language Models and Humans
TLDR
This study examined whether a recurrent neural network with Long Short-Term Memory units can mimic a central aspect of human sentence processing, namely the handling of long-distance agreement dependencies.
Mechanisms for handling nested dependencies in neural-network language models and humans
TLDR
This study examined whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing, namely the storage of grammatical number and gender information in working memory and its use in long-distance agreement.
Transformers in the loop: Polarity in neural models of language
TLDR
This work probes polarity via so-called 'negative polarity items' in two pre-trained Transformer-based models and shows that metrics derived from language models are more consistent with data from psycholinguistic experiments than with the predictions of linguistic theory.
Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models
Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts. To elucidate the mechanisms by which the models accomplish…
Comparing Gated and Simple Recurrent Neural Network Architectures as Models of Human Sentence Processing
TLDR
While the gated networks provide better language models, they do not outperform their SRN counterpart as cognitive models when language model quality is equal across network types, suggesting that the different architectures are equally valid as models of human sentence processing.

References

SHOWING 1-10 OF 47 REFERENCES
Using Deep Neural Networks to Learn Syntactic Agreement
TLDR
DNNs require large vocabularies to form substantive lexical embeddings in order to learn structural patterns, and this finding has interesting consequences for the understanding of the way in which DNNs represent syntactic information.
Targeted Syntactic Evaluation of Language Models
TLDR
In an experiment using this data set, an LSTM language model performed poorly on many of the constructions, and a large gap remained between its performance and the accuracy of human participants recruited online.
Finding syntax in human encephalography with beam search
TLDR
This pattern of results recommends the RNNG+beam search combination as a mechanistic model of the syntactic processing that occurs during normal human language comprehension.
Toward a connectionist model of recursion in human linguistic performance
TLDR
This work suggests a novel explanation of people's limited recursive performance, without assuming the existence of a mentally represented competence grammar allowing unbounded recursion.
Colorless green recurrent networks dream hierarchically
TLDR
The results support the hypothesis that RNNs are not just shallow-pattern extractors but also acquire deeper grammatical competence: they make reliable predictions about long-distance agreement and do not lag far behind human performance.
A Neural Model of Adaptation in Reading
TLDR
Adding a simple adaptation mechanism to a neural language model improves its predictions of human reading times compared to a non-adaptive model; the model adapts not only to lexical items but also to abstract syntactic structures.
Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
TLDR
It is concluded that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.
A Probabilistic Earley Parser as a Psycholinguistic Model
TLDR
Under grammatical assumptions supported by corpus-frequency data, the operation of Stolcke's probabilistic Earley parser correctly predicts processing phenomena associated with garden path structural ambiguity and with the subject/object relative asymmetry.
Neural Network Methods for Natural Language Processing
TLDR
This book focuses on the application of neural network models to natural language data, and introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models.
The role of structural prediction in rapid syntactic analysis
TLDR
In an event-related potentials (ERP) study of visual sentence reading, participants encountered violations of a word order constraint (...Max's of...) that has elicited early ERP responses in previous studies, which suggests a role for structural expectations in accounting for very fast syntactic diagnosis processes.