Publications
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
TLDR: We introduce the General Language Understanding Evaluation benchmark (GLUE), a tool for evaluating and analyzing the performance of models across a diverse range of existing NLU tasks.
  • Citations: 1,030 (256 highly influential)
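A minimal sketch of how one might load a GLUE task today, assuming the third-party Hugging Face `datasets` package (an assumption here, not part of the original GLUE release); "cola" is one of the benchmark's tasks:

    # Sketch: load a single GLUE task via the Hugging Face `datasets` package.
    # The package choice is an assumption, not the paper's own toolkit.
    from datasets import load_dataset

    cola = load_dataset("glue", "cola")  # Corpus of Linguistic Acceptability
    example = cola["train"][0]
    print(example["sentence"], example["label"])  # sentence text and its acceptability label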
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
TLDR: We present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard.
  • Citations: 233 (48 highly influential)
Supervised Open Information Extraction
TLDR: We present data and methods that enable a supervised learning approach to Open Information Extraction (Open IE).
  • Citations: 84 (28 highly influential)
Large-Scale QA-SRL Parsing
TLDR: We present a new large-scale corpus of Question-Answer driven Semantic Role Labeling (QA-SRL) annotations, and the first high-quality QA-SRL parser.
  • Citations: 35 (9 highly influential)
Crowdsourcing Question-Answer Meaning Representations
TLDR: We introduce Question-Answer Meaning Representations (QAMRs), which represent the predicate-argument structure of a sentence as a set of question-answer pairs.
  • Citations: 36 (7 highly influential)
The Winograd Schema Challenge and Reasoning about Correlation
TLDR: We introduce a framework for reasoning about correlation between sentences, and show how this framework can be used to justify solutions to some Winograd Schema problems.
  • Citations: 41 (4 highly influential)
Human-in-the-Loop Parsing
TLDR: This paper demonstrates that a parser can improve its performance with a human in the loop, by posing simple questions to non-experts.
  • Citations: 20 (1 highly influential)
AmbigQA: Answering Ambiguous Open-domain Questions
TLDR: We introduce AmbigQA, a new open-domain question answering task that involves predicting a set of question-answer pairs, where every plausible answer is paired with a disambiguated rewrite of the original question.
  • Citations: 9 (1 highly influential)
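A minimal illustrative sketch of the kind of output the AmbigQA task asks for: the ambiguous question is resolved into a set of disambiguated question rewrites, each paired with its answer. The field names and example values below are hypothetical, not the paper's released data format:

    # Hypothetical AmbigQA-style prediction: one ambiguous question maps to a
    # set of (disambiguated question, answer) pairs.
    prediction = {
        "question": "When was the Harry Potter movie released?",  # ambiguous
        "qa_pairs": [
            {"question": "When was the first Harry Potter movie released?",
             "answer": "2001"},
            {"question": "When was the last Harry Potter movie released?",
             "answer": "2011"},
        ],
    }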
Controlled Crowdsourcing for High-Quality QA-SRL Annotation
TLDR: We present an improved crowdsourcing protocol for complex semantic annotation, involving worker selection and training, and a data consolidation phase.
  • Citations: 6