Publications
Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
The success of long short-term memory (LSTM) neural networks in language processing is typically attributed to their ability to capture long-distance statistical regularities. Linguistic regularities…
  • 374 citations (57 highly influential)
Colorless green recurrent networks dream hierarchically
Recurrent neural networks (RNNs) have achieved impressive results in a variety of linguistic processing tasks, suggesting that they can induce non-trivial properties of language. We investigate here…
  • 185 citations (37 highly influential)
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
A machine learning system can score well on a given test set by relying on heuristics that are effective for frequent example types but break down in more challenging cases. We study this issue…
  • 143 citations (29 highly influential)
Revisiting the poverty of the stimulus: hierarchical generalization without a hierarchical bias in recurrent neural networks
Syntactic rules in human language usually refer to the hierarchical structure of sentences. However, the input during language acquisition can often be explained equally well with rules based on…
  • 40 citations (8 highly influential)
Targeted Syntactic Evaluation of Language Models
We present a dataset for evaluating the grammaticality of the predictions of a language model. We automatically construct a large number of minimally different pairs of English sentences, each…
  • 79 citations (6 highly influential)
BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance
If the same neural architecture is trained multiple times on the same dataset, will it make similar linguistic generalizations across runs? To study this question, we fine-tuned 100 instances of BERT…
  • 19 citations (6 highly influential)
Lexical Preactivation in Basic Linguistic Phrases
Many previous studies have shown that predictable words are read faster and lead to reduced neural activation, consistent with a model of reading in which words are activated in advance of being…
  • 34 citations (4 highly influential)
Issues in evaluating semantic spaces using word analogies
The offset method for solving word analogies has become a standard evaluation tool for vector-space semantic models: it is considered desirable for a space to represent semantic relations as…
  • 81 citations (3 highly influential)
Uncertainty and Expectation in Sentence Processing: Evidence From Subcategorization Distributions
There is now considerable evidence that human sentence processing is expectation based: as people read a sentence, they use their statistical experience with their language to generate predictions…
  • 46 citations (3 highly influential)