SQuAD: 100,000+ Questions for Machine Comprehension of Text

@inproceedings{rajpurkar2016squad,
  title={SQuAD: 100,000+ Questions for Machine Comprehension of Text},
  author={Pranav Rajpurkar and Jian Zhang and Konstantin Lopyrev and Percy Liang},
  booktitle={Conference on Empirical Methods in Natural Language Processing},
  year={2016}
}
We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. [...] However, human performance (86.8%) is much higher, indicating that the dataset presents a good challenge problem for future research. The dataset is freely available at this https URL.


Global Span Representation Model for Machine Comprehension on SQuAD

A neural architecture is implemented that globally considers all answer spans in the Recurrent Span Representation and achieves an F1 score of 38% and an EM score of 30%, with potential for higher scores with additional hyperparameter tuning.

NewsQA: A Machine Comprehension Dataset

NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs, is presented and analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment.

JaQuAD: Japanese Question Answering Dataset for Machine Reading Comprehension

This paper presents the Japanese Question Answering Dataset, JaQuAD, which is annotated by humans, and fine-tunes a baseline model that achieves an F1 score of 78.92% and an EM score of 63.38% on the test set.

Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences

The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that requires reasoning skills, and finds human solvers to achieve an F1-score of 88.1%.

Improving SQuAD Performance through Model Combination

It is found that naive combinations of the output layers decrease performance, but when the two models are combined prior to the final output component, both F1 and EM increase past either model on its own.

SQuAD Reading Comprehension

Using Stanford Question Answering Dataset (SQuAD), which is a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, this work built a reading comprehension model that attains 75.2% F1 score and 65.0% Exact Match on the test set.

SQuAD Reading Comprehension with Coattention

A model based on the Attentive Reader is used as a baseline, elements of a Dynamic Coattention Network are applied, and co-dependent attention representations that combine the individual representations of the question and the context paragraph are implemented.
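The co-dependent attention idea above can be sketched as follows. This is an illustrative NumPy sketch in the spirit of the Dynamic Coattention Network, not the paper's implementation: an affinity matrix between context and question tokens is normalized in both directions, and the resulting summaries are concatenated. All inputs here are random placeholders.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coattention(context, question):
    """Co-dependent attention sketch: context is (n_c, d), question is (n_q, d)."""
    L = context @ question.T            # (n_c, n_q) affinity matrix
    a_q = softmax(L, axis=0)            # attention over context, per question word
    a_c = softmax(L, axis=1)            # attention over question, per context word
    q_summary = a_q.T @ context         # question words summarized via the context
    c2q = a_c @ question                # each context word attends to the question
    c2q2c = a_c @ q_summary             # second-level coattention summary
    return np.concatenate([c2q, c2q2c], axis=1)   # (n_c, 2d) fused representation

# toy inputs: 7 context tokens and 4 question tokens, 5-dim embeddings
ctx = np.random.default_rng(0).standard_normal((7, 5))
qst = np.random.default_rng(1).standard_normal((4, 5))
fused = coattention(ctx, qst)
```

The fused representation per context token would then feed a downstream answer-selection component.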

Coattention Answer-Pointer Networks for Question Answering

An end-to-end neural architecture is proposed for machine comprehension and question answering in natural language understanding that consists of a Dynamic Coattention Network (DCN) encoder and a Match-LSTM decoder.

Implementation and Improvement of Match-LSTM in Question-Answering System

This paper implements a machine comprehension system for SQuAD based on the Match-LSTM model and makes two extensions on top of it: incorporating word-level similarity into the attention calculation, and using a bi-attention (question-to-context and context-to-question) mechanism to infuse the information flow into the decoder.


This paper presents a novel model architecture that efficiently builds fixed length representations of all spans in the evidence document with a recurrent network, and shows that scoring explicit span representations significantly improves performance over other approaches that factor the prediction into separate predictions about words or start and end markers.
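The idea of scoring explicit span representations can be sketched as follows. This is a minimal illustrative sketch, not the paper's recurrent architecture: every candidate span up to a maximum length is represented by concatenating its start and end token vectors (a stand-in for the recurrent span representation) and scored with a hypothetical linear layer using random placeholder weights.

```python
import numpy as np

def score_spans(hidden, max_len=4, seed=0):
    """Enumerate and score all spans of length <= max_len.

    hidden: (n_tokens, d) token representations of the evidence document.
    Returns the best-scoring (start, end) span and the full span list.
    """
    rng = np.random.default_rng(seed)
    n, d = hidden.shape
    w = rng.standard_normal(2 * d)      # hypothetical scoring weights
    spans, scores = [], []
    for start in range(n):
        for end in range(start, min(start + max_len, n)):
            rep = np.concatenate([hidden[start], hidden[end]])  # fixed-length span rep
            spans.append((start, end))
            scores.append(rep @ w)
    best = spans[int(np.argmax(scores))]
    return best, spans

# toy "document" of 6 token vectors with 8-dim representations
h = np.random.default_rng(1).standard_normal((6, 8))
best, spans = score_spans(h)
```

Scoring whole spans jointly, rather than factoring the prediction into independent start and end decisions, is the design choice the summary above highlights.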



Machine Comprehension Using Match-LSTM and Answer Pointer

This work proposes an end-to-end neural architecture for the Stanford Question Answering Dataset (SQuAD) based on match-LSTM, a model previously proposed for textual entailment, and Pointer Net, a sequence-to-sequence model proposed by Vinyals et al. (2015), to constrain the output tokens to be from the input sequences.
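The answer-pointer idea can be sketched as follows. This is an illustrative NumPy sketch of the boundary decoding step only, with random placeholder weights standing in for trained parameters: rather than generating words, the decoder points at a start index and then an end index within the passage, which guarantees the answer is a span of the input.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pointer_decode(states, w_start, w_end):
    """Boundary-style answer pointer sketch: states is (n_tokens, d)."""
    p_start = softmax(states @ w_start)     # distribution over start positions
    start = int(np.argmax(p_start))
    # constrain the end index to lie at or after the chosen start
    p_end = softmax(states[start:] @ w_end)
    end = start + int(np.argmax(p_end))
    return start, end

rng = np.random.default_rng(0)
states = rng.standard_normal((10, 6))       # toy passage encodings
start, end = pointer_decode(states, rng.standard_normal(6), rng.standard_normal(6))
```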

Learning Answer-Entailing Structures for Machine Comprehension

A unified max-margin framework is presented that learns to find hidden structures that explain the relation between the question, correct answer, and text, and is extended to incorporate multi-task learning on the different subtasks that are required to perform machine comprehension.

MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text

MCTest is presented, a freely available set of stories and associated questions intended for research on the machine comprehension of text that requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.

WikiQA: A Challenge Dataset for Open-Domain Question Answering

The WikiQA dataset is described: a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering, which is more than an order of magnitude larger than the previous dataset.

Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

This work argues for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering, and classify these tasks into skill sets so that researchers can identify (and then rectify) the failings of their systems.

A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task

A thorough examination of this new reading comprehension task is conducted by creating over a million training examples that pair CNN and Daily Mail news articles with their summarized bullet points, and it is shown that a neural network can be trained to give good performance on this task.

Modeling Biological Processes for Reading Comprehension

This paper focuses on a new reading comprehension task that requires complex reasoning over a single document, and demonstrates that answering questions via predicted structures substantially improves accuracy over baselines that use shallower representations.

Deep Read: A Reading Comprehension System

Initial work on Deep Read, an automated reading comprehension system that accepts arbitrary text input (a story) and answers questions about it, is described, along with a baseline system that retrieves the sentence containing the answer 30–40% of the time.

Learning surface text patterns for a Question Answering System

This paper develops a method for learning an optimal set of surface text patterns automatically from a tagged corpus, calculating the precision of each pattern and the average precision for each question type.

Teaching Machines to Read and Comprehend

A new methodology is defined that resolves this bottleneck and provides large-scale supervised reading comprehension data, enabling the development of a class of attention-based deep neural networks that learn to read real documents and answer complex questions with minimal prior knowledge of language structure.