TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

@inproceedings{Joshi2017TriviaQAAL,
  title={TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension},
  author={Mandar Joshi and Eunsol Choi and Daniel S. Weld and Luke Zettlemoyer},
  booktitle={ACL},
  year={2017}
}
We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. [...] We also present two baseline algorithms: a feature-based classifier and a state-of-the-art neural network that performs well on SQuAD reading comprehension. Neither approach comes close to human performance (23% and 40% vs. 80%), suggesting that TriviaQA is a challenging testbed that is worth significant future study. Data and code are available at this http URL.
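The distant-supervision setup described above can be sketched in a few lines: a (question, answer) pair is matched against evidence documents, and any document containing the answer string is treated as (noisy) support. The function name and string-containment heuristic here are illustrative assumptions, not the paper's actual collection code.

```python
def distantly_supervised_triples(qa_pairs, documents):
    """Pair each (question, answer) with every evidence document that
    contains the answer string. The match is distant supervision and
    therefore noisy: a document containing the answer string need not
    actually support the answer."""
    triples = []
    for question, answer in qa_pairs:
        for doc in documents:
            if answer.lower() in doc.lower():
                triples.append((question, answer, doc))
    return triples

qa = [("Who wrote Hamlet?", "William Shakespeare")]
docs = [
    "Hamlet is a tragedy written by William Shakespeare around 1600.",
    "Macbeth is among the shortest of Shakespeare's tragedies.",
]
triples = distantly_supervised_triples(qa, docs)
```

Note how only the first document qualifies: the second mentions "Shakespeare" but not the full answer string, which is exactly the kind of brittleness that makes distantly supervised labels noisy.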
NLQuAD: A Non-Factoid Long Question Answering Data Set
TLDR
NLQuAD’s samples exceed the input limit of most pre-trained Transformer-based models, encouraging future research on long-sequence language models; experiments show that Longformer outperforms the other architectures, but results are still far behind the human upper bound.
Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences
TLDR
The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that require reasoning skills; human solvers achieve an F1-score of 88.1%.
R3: Reinforced Ranker-Reader for Open-Domain Question Answering
TLDR
This paper proposes a new pipeline for open-domain QA with a Ranker component, which learns to rank retrieved passages in terms of likelihood of extracting the ground-truth answer to a given question, and proposes a novel method that jointly trains the Ranker along with an answer-extraction Reader model, based on reinforcement learning.
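The joint-training idea — the ranker samples a passage, and the reader's answer quality serves as its reward — reduces to a tiny REINFORCE bandit sketch. The linear logits, fixed per-passage rewards, and all names are assumptions for illustration; the actual R3 system trains a neural ranker and reader jointly on real passages.

```python
import math
import random

random.seed(0)

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

theta = [0.0, 0.0, 0.0]    # one ranker logit per candidate passage (toy)
rewards = [1.0, 0.0, 0.2]  # reader's answer F1 if fed that passage (toy)
lr = 0.5

for _ in range(200):
    probs = softmax(theta)
    i = random.choices(range(3), weights=probs)[0]  # ranker samples a passage
    baseline = sum(p * r for p, r in zip(probs, rewards))
    advantage = rewards[i] - baseline
    # REINFORCE: grad of log pi(i) w.r.t. theta is one_hot(i) - probs
    for j in range(3):
        grad = (1.0 if j == i else 0.0) - probs[j]
        theta[j] += lr * advantage * grad

final_probs = softmax(theta)
```

After training, the ranker concentrates probability on the passage that yields the highest reader reward, which is the mechanism the TLDR above describes.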
Quizbowl: The Case for Incremental Question Answering
TLDR
This work makes two key contributions to machine learning research through Quizbowl: collecting and curating a large factoid QA dataset and an accompanying gameplay dataset, and developing a computational approach to playing Quiz Bowl that involves determining both what to answer and when to answer.
Towards Robust Neural Machine Reading Comprehension via Question Paraphrases
  • Y. Li, Hongyu Li, Jing Liu
  • Computer Science
  • 2019 International Conference on Asian Language Processing (IALP)
  • 2019
TLDR
A regularized BERT-based model is proposed that encourages the model to give the same predictions for similar inputs by leveraging high-quality question paraphrases, addressing the over-sensitivity issue of neural machine reading comprehension (MRC) models.
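One common way to implement such a consistency regularizer is a symmetric KL term between the model's output distributions for a question and its paraphrase; the sketch below shows that generic idea with hand-picked logits, and is an assumption about the form of the loss, not the paper's exact formulation.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def symmetric_kl(p, q):
    """Symmetric KL divergence: 0 iff the two distributions agree."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))
    return 0.5 * (kl(p, q) + kl(q, p))

# Answer distributions for a question and its paraphrase
# (illustrative logits; a real MRC model would produce these).
p = softmax([2.0, 0.5, -1.0])
q = softmax([1.8, 0.7, -0.9])
consistency_penalty = symmetric_kl(p, q)  # added to the usual MRC loss
```

Minimizing this penalty alongside the standard answer loss pushes the model toward identical predictions on paraphrased inputs, which is the over-sensitivity fix the TLDR above describes.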
BiQuAD: Towards QA based on deeper text understanding
TLDR
This work introduces a new dataset called BiQuAD that requires deeper comprehension in order to answer questions in both extractive and deductive fashion and shows that state-of-the-art QA models do not perform well on the challenging long form contexts and reasoning requirements posed by the dataset.
Getting Closer to AI Complete Question Answering: A Set of Prerequisite Real Tasks
TLDR
QuAIL is presented, the first RC dataset to combine text-based, world-knowledge and unanswerable questions, and to provide question-type annotation that enables diagnostics of the reasoning strategies used by a given QA system.
Dual Ask-Answer Network for Machine Reading Comprehension
TLDR
A novel two-way neural sequence transduction model that connects three modalities, allowing it to learn two tasks simultaneously so that they mutually benefit one another; it outperforms the mono-learning counterpart as well as the state-of-the-art joint models on both question answering and question generation tasks.
Relation Module for Non-answerable Prediction on Question Answering
TLDR
This paper aims to improve an MRC model's ability to determine whether a question has an answer in a given context (e.g. the recently proposed SQuAD 2.0 task) by developing a relation module that is adaptable to any MRC model.

References

Showing 1-10 of 39 references
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TLDR
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
NewsQA: A Machine Comprehension Dataset
TLDR
NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs, is presented, and analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment.
A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task
TLDR
A thorough examination of this new reading comprehension task by creating over a million training examples by pairing CNN and Daily Mail news articles with their summarized bullet points, and showing that a neural network can be trained to give good performance on this task.
Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering
TLDR
This work casts neural QA as a sequence labeling problem and proposes an end-to-end sequence labeling model, which overcomes all the above challenges and outperforms the baselines significantly on WebQA.
SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine
TLDR
It is shown that there is a meaningful gap between the human and machine performances, which suggests that the proposed dataset could well serve as a benchmark for question answering.
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
TLDR
MCTest is presented, a freely available set of stories and associated questions intended for research on the machine comprehension of text that requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
Large-scale Simple Question Answering with Memory Networks
TLDR
This paper studies the impact of multitask and transfer learning for simple question answering, a setting in which the reasoning required to answer is quite easy, as long as one can retrieve the correct evidence given a question, which can be difficult in large-scale conditions.
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
TLDR
This new dataset is aimed to overcome a number of well-known weaknesses of previous publicly available datasets for the same task of reading comprehension and question answering, and is the most comprehensive real-world dataset of its kind in both quantity and quality.
WikiQA: A Challenge Dataset for Open-Domain Question Answering
TLDR
The WIKIQA dataset is described, a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering, which is more than an order of magnitude larger than the previous dataset.
A Neural Network for Factoid Question Answering over Paragraphs
TLDR
This work introduces a recursive neural network model, qanta, that can reason over question text input by modeling textual compositionality and applies it to a dataset of questions from a trivia competition called quiz bowl.