Publications
Universal Adversarial Triggers for Attacking and Analyzing NLP
TLDR
We search for universal adversarial triggers: input-agnostic sequences of tokens that trigger a model to produce a specific prediction when concatenated to any input from a dataset.
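The trigger search behind this result is gradient-guided: each trigger token is repeatedly swapped for the vocabulary token that a first-order (HotFlip-style) approximation says will most lower the loss toward the attacker's target class, averaged over a batch of inputs. Below is a minimal sketch of that loop on a toy PyTorch bag-of-embeddings classifier; the model, `find_trigger`, and every hyperparameter are illustrative stand-ins, not the authors' released code.

```python
# Illustrative sketch of a HotFlip-style universal trigger search on a toy model.
import torch
import torch.nn as nn

VOCAB, EMB, NUM_CLASSES = 1000, 32, 2

class ToyClassifier(nn.Module):
    """Bag-of-embeddings text classifier used only to illustrate the search."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.out = nn.Linear(EMB, NUM_CLASSES)

    def forward(self, token_ids):                        # (batch, seq)
        return self.out(self.emb(token_ids).mean(dim=1))

def find_trigger(model, inputs, target_class, trigger_len=3, steps=10):
    """Search for input-agnostic trigger tokens that push every example
    toward `target_class` when prepended to the input."""
    trigger = torch.zeros(trigger_len, dtype=torch.long)  # start from token 0
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        batch = torch.cat([trigger.expand(inputs.size(0), -1), inputs], dim=1)
        emb = model.emb(batch)                             # keep embedding gradient
        emb.retain_grad()
        logits = model.out(emb.mean(dim=1))
        target = torch.full((inputs.size(0),), target_class)
        loss = loss_fn(logits, target)
        model.zero_grad()
        loss.backward()
        grad = emb.grad[:, :trigger_len, :].mean(dim=0)    # avg gradient per slot
        # First-order replacement: for each trigger slot, pick the vocabulary
        # token whose embedding moves furthest along the negative gradient.
        scores = -grad @ model.emb.weight.detach().t()     # (trigger_len, vocab)
        trigger = scores.argmax(dim=1)
    return trigger

model = ToyClassifier()
inputs = torch.randint(0, VOCAB, (8, 12))                  # 8 random "sentences"
print(find_trigger(model, inputs, target_class=1))
```

Against real models, the same loop is typically run with beam search over the top-k candidate replacements per position rather than the greedy argmax used here.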
Compositional Questions Do Not Necessitate Multi-hop Reasoning
TLDR
We introduce a single-hop BERT-based reading comprehension (RC) model that achieves 67 F1, comparable to state-of-the-art multi-hop models.
Pathologies of Neural Models Make Interpretations Difficult
TLDR
We use input reduction, a process that iteratively removes the least important word from the input while maintaining the model’s prediction.
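Input reduction is a simple greedy loop, sketched below under the assumption of a `predict(tokens)` callable returning a label and a confidence; the toy model and function names are illustrative only.

```python
# Illustrative sketch of input reduction with a stand-in predict() function.
def input_reduction(tokens, predict):
    """Greedily drop the least important token while the predicted label stays
    the same; returns the reduced (often very short) input."""
    original_label, _ = predict(tokens)
    while len(tokens) > 1:
        candidates = []
        for i in range(len(tokens)):
            reduced = tokens[:i] + tokens[i + 1:]
            label, conf = predict(reduced)
            if label == original_label:          # removal must keep the prediction
                candidates.append((conf, reduced))
        if not candidates:                       # every removal flips the label
            break
        # The "least important" token is the one whose removal hurts confidence least.
        _, tokens = max(candidates, key=lambda c: c[0])
    return tokens

# Toy stand-in model: predicts 1 iff "good" appears, always with confidence 0.9.
def toy_predict(tokens):
    return (1 if "good" in tokens else 0), 0.9

print(input_reduction("this movie was very good indeed".split(), toy_predict))
# -> ['good']
```

The pathology the paper highlights is that the surviving words are often meaningless to humans, yet the model keeps its original prediction, frequently with high confidence.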
Pretrained Transformers Improve Out-of-Distribution Robustness
TLDR
We systematically measure out-of-distribution (OOD) generalization for seven NLP datasets by constructing a new robustness benchmark with realistic distribution shifts.
Do NLP Models Know Numbers? Probing Numeracy in Embeddings
TLDR
We begin by investigating the numerical reasoning capabilities of a state-of-the-art question answering model on the DROP dataset.
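One standard way to set up such a probe is a decoding task: regress from a number token's pretrained embedding back to the number's value and evaluate on held-out values, including a range unseen in training. The sketch below shows only the probing harness; the random-projection `embed` is a placeholder you would swap for real GloVe/ELMo/BERT token embeddings.

```python
# Illustrative sketch of a numeracy "decoding" probe over token embeddings.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
PROJ = rng.normal(size=(8, 64))          # placeholder "embedding table"

def embed(n):
    # Placeholder embedder: in practice, look up the pretrained embedding of
    # the token for n; this random projection only keeps the harness runnable.
    feats = np.array([n, n ** 0.5, len(str(n)), n % 10, n % 100, 1.0, 0.0, 0.0])
    return feats @ PROJ

# Probe: regress from the embedding back to the number's value, then test on a
# held-out range; with real embeddings, extrapolation is where models struggle.
train_nums = np.arange(1, 100)
test_nums = np.arange(100, 200)
probe = Ridge().fit([embed(n) for n in train_nums], train_nums)
pred = probe.predict([embed(n) for n in test_nums])
print("held-out MAE:", mean_absolute_error(test_nums, pred))
```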
AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts
TLDR
Using AutoPrompt, we show that masked language models (MLMs) have an inherent capability to perform sentiment analysis and natural language inference without additional parameters or finetuning.
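The elicitation step amounts to recasting the task as fill-in-the-blank and comparing label words at the masked position. Here is a minimal sketch using the Hugging Face `transformers` fill-mask pipeline; the prompt below is hand-written for illustration, whereas AutoPrompt searches for the trigger tokens and label words automatically with a gradient-guided procedure.

```python
# Illustrative sketch of prompt-based sentiment classification with a masked LM.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
prompt = "This movie was a complete waste of time. Overall the movie was [MASK]."

# Score candidate label words for the masked slot; the higher-scoring label
# word decides the predicted sentiment, with no fine-tuning of the MLM.
for result in fill(prompt, targets=["great", "terrible"]):
    print(result["token_str"], round(result["score"], 4))
```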
AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models
TLDR
We introduce AllenNLP Interpret, a flexible framework for interpreting NLP models.
Extracting Training Data from Large Language Models
TLDR
We perform a training data extraction attack on GPT-2, a language model trained on the public Internet, and are able to extract hundreds of verbatim text sequences from the model's training data.
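The attack is a generate-then-rank recipe: sample many sequences from the model, score them with a membership signal, and inspect the top candidates. A minimal sketch with Hugging Face GPT-2, using plain perplexity as the only ranking signal (the paper samples on a far larger scale and also uses zlib-compression and reference-model ratios):

```python
# Illustrative sketch of the generate-then-rank extraction recipe on GPT-2.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text):
    """Mean-token perplexity of `text` under the model (lower = more 'familiar')."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return float(torch.exp(loss))

# Sample unconditionally from the model, then surface the lowest-perplexity
# generations as candidate memorized training sequences.
samples = []
for _ in range(20):                      # the real attack uses vastly more samples
    out = model.generate(do_sample=True, max_length=64, top_k=40,
                         pad_token_id=tok.eos_token_id)
    samples.append(tok.decode(out[0], skip_special_tokens=True))

samples.sort(key=perplexity)
print(samples[:3])                       # most likely memorized candidates
```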