Publications
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
TLDR
GLUE comprises nine diverse NLU tasks, an auxiliary dataset for probing models' understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models; the benchmark favors models that represent linguistic knowledge in a way that facilitates sample-efficient learning and effective knowledge transfer across tasks.
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
TLDR
SuperGLUE, a new benchmark styled after GLUE, is presented, along with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard.
Supervised Open Information Extraction
TLDR
A novel formulation of Open IE as a sequence tagging problem, addressing challenges such as encoding multiple extractions for a single predicate, together with a supervised model that outperforms existing state-of-the-art Open IE systems on benchmark datasets.
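To make the sequence-tagging formulation concrete, here is a minimal sketch of how a single extraction can be encoded as per-token BIO-style labels and decoded back into spans. The sentence, the label inventory (A0, P, A1), and the decoder are illustrative assumptions, not the paper's exact scheme or its supervised model.

```python
# Illustrative sketch (not the paper's model): Open IE framed as sequence tagging.
# For a chosen predicate, each token receives a BIO-style label marking it as part
# of the predicate (P) or one of its arguments (A0, A1); label names are assumptions.

tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii", "."]
tags   = ["B-A0",   "I-A0",  "B-P", "I-P",  "I-P", "B-A1",  "O"]


def decode(tokens, tags):
    """Group contiguous B-/I- labeled tokens into (role, text) spans of one extraction."""
    spans, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = (tag[2:], [tok])
            spans.append(current)
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(tok)
        else:
            current = None
    return [(role, " ".join(words)) for role, words in spans]


print(decode(tokens, tags))
# [('A0', 'Barack Obama'), ('P', 'was born in'), ('A1', 'Hawaii')]
```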
Large-Scale QA-SRL Parsing
TLDR
A new large-scale corpus of Question-Answer driven Semantic Role Labeling (QA-SRL) annotations and the first high-quality QA-SRL parser are presented, along with neural models for two QA-SRL subtasks: detecting argument spans for a predicate and generating questions to label the semantic relationship.
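As a rough illustration of the structures involved, the sketch below shows a hypothetical QA-SRL-style annotation for one predicate and notes how the two parser subtasks map onto it. The sentence, question wording, and field names are invented for illustration rather than taken from the corpus.

```python
# Illustrative sketch of a QA-SRL-style annotation for a single predicate; the
# sentence and questions are invented, not drawn from the corpus described above.

annotation = {
    "sentence": "The company acquired the startup in 2015 .",
    "predicate": "acquired",
    "qa_pairs": [
        ("Who acquired something?", "The company"),
        ("What was acquired?", "the startup"),
        ("When was something acquired?", "in 2015"),
    ],
}

# The two subtasks in the TLDR map onto this structure:
#   1) span detection proposes answer spans such as "The company" for the predicate;
#   2) question generation produces the question that labels each span's role.
for question, answer in annotation["qa_pairs"]:
    print(f"{question:32} -> {answer}")
```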
Crowdsourcing Question-Answer Meaning Representations
TLDR
A crowdsourcing scheme is developed to show that QAMRs can be labeled with very little training, and a qualitative analysis demonstrates that the crowd-generated question-answer pairs cover the vast majority of predicate-argument relationships in existing datasets.
AmbigQA: Answering Ambiguous Open-domain Questions
TLDR
This paper introduces AmbigQA, a new open-domain question answering task which involves predicting a set of question-answer pairs, where every plausible answer is paired with a disambiguated rewrite of the original question.
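To show the shape of the task output, here is a hedged sketch in which one ambiguous prompt question maps to a set of (disambiguated question, answer) pairs. The `predict` function, the example question, and the answers are made-up placeholders, not AmbigQA data or a real system.

```python
# Hypothetical sketch of the AmbigQA output format: for one possibly ambiguous
# question, a system returns a set of (disambiguated rewrite, answer) pairs.
# The example content below is invented for illustration only.

from typing import List, Tuple


def predict(question: str) -> List[Tuple[str, str]]:
    """Stand-in for a real model; an actual system would retrieve evidence and generate these."""
    if question == "When was the Harry Potter movie released?":
        return [
            ("When was the first Harry Potter movie released?", "November 2001"),
            ("When was the last Harry Potter movie released?", "July 2011"),
        ]
    return []


for rewrite, answer in predict("When was the Harry Potter movie released?"):
    print(f"{rewrite} -> {answer}")
```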
The Winograd Schema Challenge and Reasoning about Correlation
TLDR
A framework for reasoning about correlation between sentences is introduced, and it is shown how this framework can be used to justify solutions to some Winograd Schema problems.
Asking without Telling: Exploring Latent Ontologies in Contextual Representations
TLDR
This work introduces latent subclass learning (LSL): a modification to existing classifier-based probing methods that induces a latent categorization (or ontology) of the probe's inputs, extracting emergent structure from input representations in an interpretable and quantifiable form.
Human-in-the-Loop Parsing
TLDR
This paper demonstrates that a parser can improve its performance with a human in the loop by posing simple questions to non-experts, and applies the approach to a CCG parser, converting uncertain attachment decisions into natural language questions about the arguments of verbs.
Controlled Crowdsourcing for High-Quality QA-SRL Annotation
TLDR
An improved crowdsourcing protocol for complex semantic annotation is presented, involving worker selection and training and a data consolidation phase; it yielded high-quality annotation with drastically higher coverage, producing a new gold evaluation dataset.