SQuAD: 100,000+ Questions for Machine Comprehension of Text

@inproceedings{Rajpurkar2016SQuAD1Q,
  title={SQuAD: 100,000+ Questions for Machine Comprehension of Text},
  author={Pranav Rajpurkar and Jian Zhang and Konstantin Lopyrev and Percy Liang},
  booktitle={EMNLP},
  year={2016}
}
We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. [...] However, human performance (86.8%) is much higher, indicating that the dataset presents a good challenge problem for future research. The dataset is freely available at this https URL
Global Span Representation Model for Machine Comprehension on SQuAD
Machine comprehension of text is an important problem in natural language processing. A recently released dataset, the Stanford Question Answering Dataset (SQuAD), offers a large number of real [...]
NewsQA: A Machine Comprehension Dataset
TLDR
NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs, is presented and analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment.
Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences
TLDR
The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that requires reasoning skills, and finds human solvers to achieve an F1-score of 88.1%.
SQuAD Reading Comprehension
One important task in Natural Language Understanding is Reading Comprehension. Given a piece of text, we want to be able to answer any relevant questions. Using Stanford Question Answering [...]
SQuAD Reading Comprehension with Coattention
Reading comprehension is an important task in NLP, which involves teaching a machine to understand text enough to answer questions. The Stanford Question Answering Dataset (SQuAD) is a dataset [...]
Coattention Answer-Pointer Networks for Question Answering
Machine comprehension (MC) and question answering (QA) are crucial tasks in natural language understanding. Training deep neural network-based QA models has become practical upon the recent release [...]
Implementation and Improvement of Match-LSTM in Question-Answering System
In this paper, we tackle the popular machine comprehension task derived from the Stanford Question Answering Dataset (SQuAD), which consists of more than 100 thousand questions whose answers are segments [...]
EXTRACTIVE QUESTION ANSWERING
The reading comprehension task, which asks questions about a given evidence document, is a central problem in natural language understanding. Recent formulations of this task have typically focused on [...]
Deep Coattention Networks for Reading Comprehension
Machine reading comprehension of text is an important task in Natural Language Processing. A recently released dataset, the Stanford Question Answering Dataset (SQuAD) formulates the problem as [...]
Selecting Paragraphs to Answer Questions for Multi-passage Machine Reading Comprehension
TLDR
A learning to rank framework with an attention-based neural network to select the best-matching paragraph for a specific question to address the problem of question answering style multi-passage Machine Reading Comprehension.

References

SHOWING 1-10 OF 31 REFERENCES
Machine Comprehension Using Match-LSTM and Answer Pointer
TLDR
This work proposes an end-to-end neural architecture for the Stanford Question Answering Dataset (SQuAD), based on match-LSTM, a model previously proposed for textual entailment, and Pointer Net, a sequence-to-sequence model proposed by Vinyals et al. (2015) to constrain the output tokens to be from the input sequences.
Learning Answer-Entailing Structures for Machine Comprehension
TLDR
A unified max-margin framework is presented that learns to find hidden structures that explain the relation between the question, correct answer, and text, and is extended to incorporate multi-task learning on the different subtasks that are required to perform machine comprehension.
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
TLDR
MCTest is presented, a freely available set of stories and associated questions intended for research on the machine comprehension of text that requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
WikiQA: A Challenge Dataset for Open-Domain Question Answering
TLDR
The WIKIQA dataset is described, a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering, which is more than an order of magnitude larger than the previous dataset.
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
TLDR
This work argues for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering, and classifies these tasks into skill sets so that researchers can identify (and then rectify) the failings of their systems.
A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task
TLDR
A thorough examination of this new reading comprehension task by creating over a million training examples by pairing CNN and Daily Mail news articles with their summarized bullet points, and showing that a neural network can be trained to give good performance on this task.
Modeling Biological Processes for Reading Comprehension
TLDR
This paper focuses on a new reading comprehension task that requires complex reasoning over a single document, and demonstrates that answering questions via predicted structures substantially improves accuracy over baselines that use shallower representations.
Deep Read: A Reading Comprehension System
TLDR
Initial work on Deep Read, an automated reading comprehension system that accepts arbitrary text input (a story) and answers questions about it, is described, with a baseline system that retrieves the sentence containing the answer 30–40% of the time.
Learning surface text patterns for a Question Answering System
TLDR
This paper has developed a method for learning an optimal set of surface text patterns automatically from a tagged corpus, and calculates the precision of each pattern, and the average precision for each question type.
A Machine Learning Approach to Answering Questions for Reading Comprehension Tests
TLDR
To the best of the authors' knowledge, this is the first work that reports that the use of a machine learning approach achieves competitive results on answering questions for reading comprehension tests.