The NarrativeQA Reading Comprehension Challenge

@article{Kocisk2018TheNR,
  title={The NarrativeQA Reading Comprehension Challenge},
  author={Tom{\'a}{\v{s}} Ko{\v{c}}isk{\'y} and Jonathan Schwarz and Phil Blunsom and Chris Dyer and Karl Moritz Hermann and G{\'a}bor Melis and Edward Grefenstette},
  journal={Transactions of the Association for Computational Linguistics},
  year={2018},
  volume={6},
  pages={317--328}
}
Reading comprehension (RC), in contrast to information retrieval, requires integrating information and reasoning about events, entities, and their relations across a full document. [...] These tasks are designed so that successfully answering their questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience. We show that although humans solve the tasks easily, standard RC models struggle on the tasks presented here. We provide an analysis of the…
Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning
TLDR
This paper introduces Cosmos QA, a large-scale dataset of 35,600 problems that require commonsense-based reading comprehension, formulated as multiple-choice questions, and proposes a new architecture that improves over the competitive baselines.
What Makes Reading Comprehension Questions Easier?
TLDR
This study proposes simple heuristics to split each dataset into easy and hard subsets, examines the performance of two baseline models on each subset, and observes that baseline performance on the hard subsets degrades markedly compared with performance on the full datasets.
TellMeWhy: A Dataset for Answering Why-Questions in Narratives
TLDR
This work introduces TellMeWhy, a new crowd-sourced dataset that consists of more than 30k questions and free-form answers concerning why characters in short narratives perform the actions described, and shows that state-of-the-art models are far below human performance on answering such questions.
On Making Reading Comprehension More Comprehensive
TLDR
This work justifies a question answering approach to reading comprehension and describes the various kinds of questions one might use to more fully test a system's comprehension of a passage, moving beyond questions that only probe local predicate-argument structures.
RC-QED: Evaluating Natural Language Derivations in Multi-Hop Reading Comprehension
TLDR
RC-QED is introduced, a new RC task that requires giving not only the correct answer to a question but also the reasoning employed for arriving at that answer; the benchmark is robust to simple heuristics and challenging for state-of-the-art neural path-ranking approaches.
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
TLDR
A new reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs, and presents a new model that combines reading comprehension methods with simple numerical reasoning to achieve 51% F1.
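A toy sketch (my own illustration, not the DROP authors' model) of what discrete reasoning over paragraphs involves: the answer below must be computed from numbers in the passage rather than copied out as a span. The passage and question are invented for this example.

import re

# Invented example passage; DROP-style answers often require arithmetic like this.
passage = ("The Bears scored 21 points in the first half "
           "and 10 points in the second half.")

# "How many more points did the Bears score in the first half than the second?"
first, second = (int(n) for n in re.findall(r"\d+", passage))
print(first - second)  # -> 11, a number that appears nowhere in the passage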
BookQA: Stories of Challenges and Opportunities
We present a system for answering questions based on the full text of books (BookQA), which first selects book passages given a question at hand, and then uses a memory network to reason and predict an answer.
DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension
TLDR
DREAM is the first dialogue-based multiple-choice reading comprehension data set to focus on in-depth multi-turn multi-party dialogue understanding; experimental results on it show the effectiveness of dialogue structure and general world knowledge.
Multi-style Generative Reading Comprehension
This study tackles generative reading comprehension (RC), which consists of answering questions based on textual evidence and natural language generation (NLG). We propose a multi-style abstractive…
Getting Closer to AI Complete Question Answering: A Set of Prerequisite Real Tasks
TLDR
QuAIL is presented, the first RC dataset to combine text-based, world-knowledge, and unanswerable questions, and to provide question-type annotation that enables diagnostics of the reasoning strategies used by a given QA system.

References

Showing 1-10 of 33 references
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
TLDR
MCTest is presented, a freely available set of stories and associated questions intended for research on the machine comprehension of text; it requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
TLDR
This work argues for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering, and classifies these tasks into skill sets so that researchers can identify (and then rectify) the failings of their systems.
A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task
TLDR
A thorough examination of this new reading comprehension task, created by pairing CNN and Daily Mail news articles with their summarized bullet points to yield over a million training examples, showing that a neural network can be trained to give good performance on it.
Gated Self-Matching Networks for Reading Comprehension and Question Answering
TLDR
The gated self-matching networks for reading-comprehension-style question answering, which aims to answer questions from a given passage, are presented; the model held first place on the SQuAD leaderboard for both single and ensemble models.
NewsQA: A Machine Comprehension Dataset
TLDR
NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs, is presented; analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment.
MovieQA: Understanding Stories in Movies through Question-Answering
TLDR
The MovieQA dataset, which aims to evaluate automatic story comprehension from both video and text, is introduced, and existing QA techniques are extended to show that question answering with such open-ended semantics is hard.
Machine Comprehension Using Match-LSTM and Answer Pointer
TLDR
This work proposes an end-to-end neural architecture for the Stanford Question Answering Dataset (SQuAD), based on match-LSTM, a model previously proposed for textual entailment, and Pointer Net, a sequence-to-sequence model proposed by Vinyals et al. (2015), to constrain the output tokens to be from the input sequences.
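To make the pointer idea above concrete, here is a minimal sketch (my own simplification, not the paper's code) of boundary-style answer decoding: the model emits start and end distributions over passage positions, so the predicted answer is always a span of the input. The logits below are hypothetical stand-ins for match-LSTM encoder outputs.

import torch

def pointer_decode(start_logits, end_logits, max_len=15):
    # Pick the (start, end) pair maximizing P(start) * P(end), with end >= start.
    start_p = torch.softmax(start_logits, dim=-1)
    end_p = torch.softmax(end_logits, dim=-1)
    best_score, best_span = -1.0, (0, 0)
    for s in range(len(start_p)):
        for e in range(s, min(s + max_len, len(end_p))):
            score = (start_p[s] * end_p[e]).item()
            if score > best_score:
                best_score, best_span = score, (s, e)
    return best_span

# Hypothetical logits over a six-token passage.
start = torch.tensor([0.1, 2.0, 0.3, 0.1, 0.0, 0.2])
end = torch.tensor([0.0, 0.1, 0.4, 2.5, 0.1, 0.1])
print(pointer_decode(start, end))  # -> (1, 3): tokens 1 through 3 of the passage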
Text Understanding with the Attention Sum Reader Network
TLDR
A new, simple model is presented that uses attention to directly pick the answer from the context, as opposed to computing the answer from a blended representation of words in the document as is usual in similar models; this makes the model particularly suitable for question-answering problems where the answer is a single word from the document.
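The attention-sum mechanism described above has a compact core, sketched here under my own simplifications (plain Python, invented weights): a candidate's score is the total attention mass over all of its occurrences in the document, so repeated mentions reinforce each other.

import numpy as np

def attention_sum(doc_tokens, attention_weights, candidates):
    # Sum per-position attention into one probability per candidate token.
    return {c: float(sum(w for t, w in zip(doc_tokens, attention_weights) if t == c))
            for c in candidates}

doc = ["mary", "went", "to", "the", "store", "and", "mary", "bought", "milk"]
# Hypothetical attention weights; a real model derives these from a
# query-aware document encoder.
attn = np.array([0.30, 0.02, 0.01, 0.01, 0.15, 0.01, 0.25, 0.05, 0.20])
print(attention_sum(doc, attn, {"mary", "store", "milk"}))
# "mary" wins (0.55) because its two occurrences are summed: 0.30 + 0.25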
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TLDR
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
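For reference, the F1 figure quoted above is SQuAD's token-overlap metric; the following is a simplified sketch of it (omitting the official article removal and punctuation stripping), not the evaluation script itself.

from collections import Counter

def token_f1(prediction: str, gold: str) -> float:
    # Harmonic mean of token-level precision and recall between answers.
    pred_toks, gold_toks = prediction.lower().split(), gold.lower().split()
    overlap = sum((Counter(pred_toks) & Counter(gold_toks)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred_toks), overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(token_f1("the Broncos defense", "Broncos defense"))  # -> 0.8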
The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations
TLDR
There is a sweet spot, not too big and not too small, between single words and full sentences that allows the most meaningful information in a text to be effectively retained and recalled; models that store explicit representations of long-term contexts outperform state-of-the-art neural language models at predicting semantic content words.