A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories

Authors: N. Mostafazadeh, Nathanael Chambers, Xiaodong He, Devi Parikh, Dhruv Batra, Lucy Vanderwende, Pushmeet Kohli, James F. Allen
Representation and learning of commonsense knowledge is one of the foundational problems in the quest to enable deep language understanding. […] We created a new corpus of 50k five-sentence commonsense stories, ROCStories, to enable this evaluation. This corpus is unique in two ways: (1) it captures a rich set of causal and temporal commonsense relations between daily events, and (2) it is a high-quality collection of everyday life stories that can also be used for story generation. Experimental…


CIS2: A Simplified Commonsense Inference Evaluation for Story Prose
A system that forces the model to focus on CCI directly by providing it the original text of the story to use for understanding while having it generate only the bare minimum: indices to sentences. It achieves a 4.3% higher CCI accuracy than those trained for generating full phrases.
Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian
Story comprehension that involves complex causal and temporal relations is a critical task in NLP, but previous studies have focused predominantly on English, leaving open the question of how the
A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation
A knowledge-enhanced pretraining model that utilizes commonsense knowledge from external knowledge bases and generates more reasonable stories than state-of-the-art baselines, particularly in terms of logic and global coherence.
Story Comprehension for Predicting What Happens Next
This paper presents a story comprehension model that explores three distinct semantic aspects: the sequence of events described in the story, its emotional trajectory, and its plot consistency, and uses a hidden variable to weigh the semantic aspects in the context of the story.
Find a Reasonable Ending for Stories: Does Logic Relation Help the Story Cloze Test?
This paper incorporates logic information into the Story Cloze Test (SCT), with the help of the Natural Language Inference task, to improve understanding of the whole story.
STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation
A dataset and evaluation platform built from STORIUM, an online collaborative storytelling community that contains 6K lengthy stories with fine-grained natural language annotations interspersed throughout each narrative, forming a robust source for guiding models.
Tackling the Story Ending Biases in The Story Cloze Test
A new crowdsourcing scheme is designed to create a new SCT dataset that overcomes some of the biases. A few models are benchmarked on the new dataset, showing that the top-performing model on the original SCT datasets fails to keep up its performance.
TellMeWhy: A Dataset for Answering Why-Questions in Narratives
This work introduces TellMeWhy, a new crowd-sourced dataset that consists of more than 30k questions and free-form answers concerning why characters in short narratives perform the actions described, and shows that state-of-the-art models are far below human performance on answering such questions.
CoCoLM: Complex Commonsense Enhanced Language Model with Discourse Relations
This paper proposes a general language model named CoCoLM, which, through careful training over a large-scale eventuality knowledge graph, ASER, successfully teaches pre-trained language models (i.e., BERT and RoBERTa) rich multi-hop commonsense knowledge among eventualities.
Commonsense Knowledge in Word Associations and ConceptNet
An in-depth comparison of two large-scale resources of general knowledge: ConceptNet, an engineered relational database, and SWOW, a knowledge graph derived from crowd-sourced word associations. It shows empirically that both resources improve downstream task performance on commonsense reasoning benchmarks over text-only baselines.


MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
MCTest is presented, a freely available set of stories and associated questions intended for research on the machine comprehension of text that requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
Understanding script-based stories using commonsense reasoning
  • E. Mueller
  • Computer Science, Cognitive Systems Research
  • 2004
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
This work argues for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering, and classifies these tasks into skill sets so that researchers can identify (and then rectify) the failings of their systems.
Unsupervised Learning of Narrative Event Chains
A three-step process for learning narrative event chains, using unsupervised distributional methods to learn narrative relations between events sharing coreferring arguments. The paper also introduces two evaluations: the narrative cloze to evaluate event relatedness, and an order coherence task to evaluate narrative order.
Episodic Logic Meets Little Red Riding Hood: A Comprehensive, Natural Representation for Language Un
A comprehensive framework for narrative understanding based on Episodic Logic (EL), developed and implemented as a semantic representation and commonsense knowledge representation that would serve the full range of interpretive and inferential needs of general NLU.
Learning to Tell Tales: A Data-driven Approach to Story Generation
This paper creates an end-to-end system that realizes the various components of the generation pipeline stochastically and follows a generate-and-rank approach where the space of multiple candidate stories is pruned by considering whether they are plausible, interesting, and coherent.
Searching for Storiness: Story-Generation from a Reader's Perspective
An approach to automatic story-generation based on an intuitive model of the cognitive states and processes within the mind of an imagined reader of the story is described.
CaTeRS: Causal and Temporal Relation Scheme for Semantic Annotation of Event Structures
A novel semantic annotation framework, called Causal and Temporal Relation Scheme (CaTeRS), which is unique in simultaneously capturing a comprehensive set of temporal and causal relations between events.
Identifying Personal Stories in Millions of Weblog Entries
Efforts to develop a standard corpus for researchers in this area by identifying personal stories in the tens of millions of blog posts in the ICWSM 2009 Spinn3r Dataset are described.
Generating Coherent Event Schemas at Scale
This work presents a novel approach to inducing open-domain event schemas that overcomes limitations of Chambers and Jurafsky's (2009) schemas and uses co-occurrence statistics of semantically typed relational triples, which it calls Rel-grams (relational n-grams).