Social IQA: Commonsense Reasoning about Social Interactions

@inproceedings{Sap2019SocialIC,
  title={Social IQA: Commonsense Reasoning about Social Interactions},
  author={Maarten Sap and Hannah Rashkin and Derek Chen and Ronan Le Bras and Yejin Choi},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
  year={2019}
}
We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. [...] Through crowdsourcing, we collect commonsense questions along with correct and incorrect answers about social interactions, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question. Empirical results show that our benchmark is challenging for existing question-answering models based on…
RiddleSense: Answering Riddle Questions as Commonsense Reasoning
TLDR
RiddleSense is proposed, a novel multiple-choice question answering challenge for benchmarking higher-order commonsense reasoning models, which is the first large dataset for riddle-style commonsense question answering, where the distractors are crowdsourced from human annotators.
COM2SENSE: A Commonsense Reasoning Benchmark with Complementary Sentences
TLDR
This work introduces a new commonsense reasoning benchmark dataset comprising natural language true/false statements, with each sample paired with its complementary counterpart, resulting in 4k sentence pairs, and proposes a pairwise accuracy metric to reliably measure an agent’s ability to perform commonsense reasoning over a given situation.
PIQA: Reasoning about Physical Commonsense in Natural Language
TLDR
The task of physical commonsense reasoning and a corresponding benchmark dataset, Physical Interaction: Question Answering (PIQA), are introduced, and analysis of the dimensions of knowledge that existing models lack is provided, which offers significant opportunities for future research.
Commonsense-Focused Dialogues for Response Generation: An Empirical Study
Smooth and effective communication requires the ability to perform latent or explicit commonsense inference. Prior commonsense reasoning benchmarks (such as SocialIQA and CommonsenseQA) mainly focus…
Go Beyond Plain Fine-Tuning: Improving Pretrained Models for Social Commonsense
TLDR
This study focuses on the Social IQA dataset, a task requiring social and emotional commonsense reasoning, and proposes several architecture variations and extensions, as well as leveraging external commonsense corpora, to optimize the model for SocialIQA.
Towards Generative Commonsense Reasoning: A Concept Paper
  • Bill Yuchen Lin
  • 2019
In this concept paper, we first review recent advances in machine commonsense reasoning and further investigate their potential connections with natural language generation. Current research efforts…
A Semantic-based Method for Unsupervised Commonsense Question Answering
TLDR
A novel SEmantic-based Question Answering method (SEQA) for unsupervised commonsense question answering that first generates a set of plausible answers with generative models, and then uses these plausible answers to select the correct choice by considering the semantic similarity between each plausible answer and each choice.
Prompting Contrastive Explanations for Commonsense Reasoning Tasks
TLDR
Inspired by the contrastive nature of human explanations, large pretrained language models are used to complete explanation prompts which contrast alternatives according to the key attribute(s) required to justify the correct answer.
Lifelong Knowledge-Enriched Social Event Representation Learning
TLDR
This work investigates methods to incorporate pragmatic aspects into social event embeddings by leveraging social commonsense knowledge and introduces continual learning strategies that allow for incremental consolidation of new knowledge while retaining and promoting efficient usage of prior knowledge.
CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge
Most benchmark datasets targeting commonsense reasoning focus on everyday scenarios: physical knowledge like knowing that you could fill a cup under a waterfall [Talmor et al., 2019], social…

References

Showing 1–10 of 48 references
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
TLDR
This work presents CommonsenseQA: a challenging new dataset for commonsense question answering, which extracts from ConceptNet multiple target concepts that have the same semantic relation to a single source concept.
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference
TLDR
This paper introduces the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning, and proposes Adversarial Filtering (AF), a novel procedure that constructs a de-biased dataset by iteratively training an ensemble of stylistic classifiers, and using them to filter the data.
Know What You Don’t Know: Unanswerable Questions for SQuAD
TLDR
SQuAD 2.0 is a new dataset that combines the existing Stanford Question Answering Dataset (SQuAD) with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.
From Recognition to Cognition: Visual Commonsense Reasoning
TLDR
To move towards cognition-level understanding, a new reasoning engine is presented, Recognition to Cognition Networks (R2C), that models the necessary layered inferences for grounding, contextualization, and reasoning.
ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning
TLDR
Experimental results demonstrate that multitask models that incorporate the hierarchical structure of if-then relation types lead to more accurate inference compared to models trained in isolation, as measured by both automatic and human evaluation.
Crowdsourcing a Word–Emotion Association Lexicon
TLDR
It is shown how the combined strength and wisdom of the crowds can be used to generate a large, high-quality, word–emotion and word–polarity association lexicon quickly and inexpensively.
Tackling the Story Ending Biases in The Story Cloze Test
TLDR
A new crowdsourcing scheme is designed that creates a new SCT dataset that overcomes some of the biases; a few models are benchmarked on the new dataset, showing that the top-performing model on the original SCT dataset fails to keep up its performance.
Commonsense Causal Reasoning between Short Texts
TLDR
A framework that automatically harvests a network of causal-effect terms from a large web corpus is proposed that outperforms all previously reported results on the standard SemEval COPA task by substantial margins.
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TLDR
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
HellaSwag: Can a Machine Really Finish Your Sentence?
TLDR
The construction of HellaSwag, a new challenge dataset, and its resulting difficulty, sheds light on the inner workings of deep pretrained models, and suggests a new path forward for NLP research, in which benchmarks co-evolve with the evolving state of the art in an adversarial way, so as to present ever-harder challenges.