ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning
TLDR
Experimental results demonstrate that multitask models that incorporate the hierarchical structure of if-then relation types lead to more accurate inference compared to models trained in isolation, as measured by both automatic and human evaluation.
Social IQA: Commonsense Reasoning about Social Interactions
TLDR
It is established that Social IQa, the first large-scale benchmark for commonsense reasoning about social situations, is challenging for existing question-answering models based on pretrained language models, compared to human performance (>20% gap).
WINOGRANDE: An Adversarial Winograd Schema Challenge at Scale
TLDR
This work introduces WinoGrande, a large-scale dataset of 44k problems, inspired by the original WSC design, but adjusted to improve both the scale and the hardness of the dataset, and establishes new state-of-the-art results on five related benchmarks.
Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning
TLDR
This paper introduces Cosmos QA, a large-scale dataset of 35,600 problems that require commonsense-based reading comprehension, formulated as multiple-choice questions, and proposes a new architecture that improves over the competitive baselines.
Abductive Commonsense Reasoning
TLDR
This study introduces a challenge dataset, ART, that consists of over 20k commonsense narrative contexts and 200k explanations, and conceptualizes two new tasks -- Abductive NLI: a multiple-choice question answering task for choosing the more likely explanation, and Abductive NLG: a conditional generation task for explaining given observations in natural language.
PIQA: Reasoning about Physical Commonsense in Natural Language
TLDR
The task of physical commonsense reasoning and a corresponding benchmark dataset, Physical Interaction: Question Answering (PIQA), are introduced, and an analysis of the dimensions of knowledge that existing models lack is provided, which offers significant opportunities for future research.
Adversarial Filters of Dataset Biases
TLDR
This work presents extensive supporting evidence that AFLite is broadly applicable for reduction of measurable dataset biases, and that models trained on the filtered datasets yield better generalization to out-of-distribution tasks.
Unsupervised Commonsense Question Answering with Self-Talk
TLDR
An unsupervised framework based on self-talk, inspired by inquiry-based discovery learning, is presented as a novel alternative for multiple-choice commonsense tasks; it improves performance on several benchmarks and competes with models that obtain knowledge from external KBs.
COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs
TLDR
It is proposed that manually constructed CSKGs will never achieve the coverage necessary to be applicable in all situations encountered by NLP agents, and a new evaluation framework is proposed for testing the utility of KGs based on how effectively implicit knowledge representations can be learned from them.
On the Erdős Discrepancy Problem
TLDR
It is proved that any completely multiplicative sequence of size 127,646 or more has discrepancy at least 4, proving the Erdős discrepancy conjecture for discrepancy up to 3 and providing inductive construction rules as well as streamlining methods to improve the lower bounds for sequences of higher discrepancies.