• Corpus ID: 236133988

Different kinds of cognitive plausibility: why are transformers better than RNNs at predicting N400 amplitude?

James A. Michaelov, Megan D. Bardolph, Seana Coulson, Benjamin K. Bergen
Despite being designed for performance rather than cognitive plausibility, transformer language models have been found to be better at predicting metrics used to assess human language comprehension than language models with other architectures, such as recurrent neural networks. Based on how well they predict the N400, a neural signal associated with processing difficulty, we propose and provide evidence for one possible explanation—their predictions are affected by the preceding context in a… 

Context Limitations Make Neural Language Models More Human-Like

Language models (LMs) have been used in cognitive modeling as well as engineering studies—they compute information-theoretic complexity metrics that simulate humans’ cognitive load during reading.

Collateral facilitation in humans and language models

Are the predictions of humans and language models affected by similar things? Research suggests that while comprehending language, humans make predictions about upcoming words, with more predictable

So Cloze yet so Far: N400 Amplitude is Better Predicted by Distributional Information than Human Predictability Judgements

It is found that the predictions of three top-of-the-line contemporary language models—GPT-3, RoBERTa, and ALBERT—match the N400 more closely than human predictions, suggesting that the predictive processes underlying the N400 may be more sensitive to the surface-level statistics of language than previously thought.

Testing the limits of natural language models for predicting human language judgments

This work compared the model-human consistency of diverse language models using a novel experimental approach: controversial sentence pairs, which proved highly effective at revealing model failures and identifying models that aligned most closely with human judgments.

Patterns of Text Readability in Human and Predicted Eye Movements

It has been shown that multilingual transformer models are able to predict human reading behavior when fine-tuned on small amounts of eye tracking data. As the cumulated prediction results do not



How well does surprisal explain N400 amplitude under different experimental conditions?

It is found that surprisal can predict N400 amplitude in a wide range of cases, and the cases where it cannot do so provide valuable insight into the neurocognitive processes underlying the response.

What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models

A suite of diagnostics drawn from human language experiments is introduced, which allows targeted questions to be asked about the information language models use when generating predictions in context, and the diagnostics are applied to the popular BERT model.

Cognitively Plausible Models of Human Language Processing

The challenge is to build models that integrate multiple aspects of human language processing at the syntactic, semantic, and discourse levels, and that are incremental, predictive, broad-coverage, and robust to noise.

A Tale of Two Positivities (and the N400): Distinct neural signatures are evoked by confirmed and violated predictions at different levels of representation

It is suggested that the late posterior positivity/P600 is triggered when the comprehender detects a conflict between the input and her model of the communicator and communicative environment, and provides strong evidence that confirmed and violated predictions at different levels of representation manifest as distinct spatiotemporal neural signatures.

Separate streams or probabilistic inference? What the N400 can tell us about the comprehension of events

  • G. Kuperberg
  • Computer Science
    Language, cognition and neuroscience
  • 2016
It is argued that the computational principles of this framework can be extended to understand how comprehenders infer situation models during discourse comprehension, and intended messages during spoken communication.

Comprehending surprising sentences: sensitivity of post-N400 positivities to contextual congruity and semantic relatedness

Any proposal for predictive language comprehension must address receipt of less expected information. While a relationship between the N400 and sentence predictability is well established, a

Comparing Transformers and RNNs on predicting human sentence processing data

This paper trains both Transformer- and RNN-based language models, compares their performance as models of human sentence processing, and shows that the Transformers outperform the RNNs as cognitive models in explaining self-paced reading times and N400 strength, but not gaze durations from an eye-tracking experiment.

Using Language Models and Latent Semantic Analysis to Characterise the N400m Neural Response

This paper investigates whether predictors derived from Latent Semantic Analysis, language models, and Roark’s parser are significant in modeling the N400m, and shows that predictors based on the 4-gram language model and the pairwise-priming language model are highly correlated with the manual annotation of contextual plausibility.