Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering

@inproceedings{Mihaylov2018CanAS,
  title={Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering},
  author={Todor Mihaylov and Peter Clark and Tushar Khot and Ashish Sabharwal},
  booktitle={EMNLP},
  year={2018}
}
We present a new kind of question answering dataset, OpenBookQA, modeled after open book exams for assessing human understanding of a subject. [...] Key Result: Our oracle experiments designed to circumvent the knowledge retrieval bottleneck demonstrate the value of both the open book and additional facts. We leave it as a challenge to solve the retrieval problem in this multi-hop setting and to close the large gap to human performance.
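To make the task format concrete, here is a minimal sketch of parsing one OpenBookQA item, assuming the JSONL layout of the public release; the question text is the running example from the paper, while the id value is illustrative:

import json

# One OpenBookQA item in the JSONL layout of the public release
# (the id value here is illustrative).
raw = '''{"id": "7-980",
  "question": {"stem": "Which of these would let the most heat travel through?",
    "choices": [{"text": "a new pair of jeans", "label": "A"},
                {"text": "a steel spoon in a cafeteria", "label": "B"},
                {"text": "a cotton candy at a store", "label": "C"},
                {"text": "a calvin klein cotton hat", "label": "D"}]},
  "answerKey": "B"}'''

item = json.loads(raw)
choices = {c["label"]: c["text"] for c in item["question"]["choices"]}
print(item["question"]["stem"])
for label in sorted(choices):
    marker = "*" if label == item["answerKey"] else " "
    print(f" {marker} {label}. {choices[label]}")

Answering this item requires combining a book fact (metal is a thermal conductor) with the commonsense knowledge that a steel spoon is made of metal, which is exactly the multi-hop retrieval challenge the abstract describes.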
Citations

Careful Selection of Knowledge to Solve Open Book Question Answering
TLDR: This paper addresses QA on the OpenBookQA dataset, combining state-of-the-art language models with abductive information retrieval (IR), information-gain-based re-ranking, passage selection, and weighted scoring to reach 72.0% accuracy.
BiQuAD: Towards QA based on deeper text understanding
TLDR: This work introduces BiQuAD, a new dataset that requires deeper comprehension to answer questions in both extractive and deductive fashion, and shows that state-of-the-art QA models do not perform well on the challenging long-form contexts and reasoning requirements it poses.
Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering
TLDR: This work reviews the latest research trends in OpenQA, with particular attention to systems that incorporate neural MRC techniques, and revisits the origin and development of OpenQA systems.
Deep learning based question answering system in Bengali
TLDR: This work uses state-of-the-art transformer models to train a QA system on a synthetic reading comprehension dataset translated from SQuAD 2.0, one of the most popular English benchmark datasets.
Answering Science Exam Questions Using Query Reformulation with Background Knowledge
TLDR: This paper presents a system that reformulates a given question into queries used to retrieve supporting text from a large corpus of science-related text, outperforming several strong baselines on the ARC dataset.
Answering Science Exam Questions Using Query Rewriting with Background Knowledge
TLDR: A system is presented that rewrites a given question into queries used to retrieve supporting text from a large corpus of science-related text; it outperforms several strong baselines on the ARC dataset.
Exploring ways to incorporate additional knowledge to improve Natural Language Commonsense Question Answering
TLDR: This work identifies external knowledge sources and shows that performance improves further when a set of facts retrieved through IR is prepended to each MCQ question during both training and testing. It presents three different modes of passing knowledge and five different models of using it, including the standard BERT MCQ model, as sketched below.
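To make the fact-prepending setup concrete, here is a hedged sketch of assembling per-choice inputs for a BERT-style MCQ model. The retrieve_facts function is a hypothetical placeholder for whatever IR component supplies the facts, not the authors' code:

from typing import List

def retrieve_facts(query: str, k: int = 3) -> List[str]:
    # Hypothetical placeholder: a real system would query an IR index here.
    return ["metal is a thermal conductor"][:k]

def build_mcq_inputs(question: str, choices: List[str]) -> List[str]:
    facts = " ".join(retrieve_facts(question))
    # One sequence per answer choice; a BERT-style MCQ model scores each
    # sequence and predicts the highest-scoring choice.
    return [f"{facts} [SEP] {question} [SEP] {choice}" for choice in choices]

inputs = build_mcq_inputs(
    "Which of these would let the most heat travel through?",
    ["a new pair of jeans", "a steel spoon in a cafeteria"],
)
for seq in inputs:
    print(seq)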
Common Sense-Based Reasoning Using External Knowledge for Question Answering
TLDR: A model is proposed that predicts the correct answer by retrieving the evidence needed to answer the question from external knowledge and using that evidence as context, yielding a 4.04% improvement over the RoBERTa baseline.
ActKnow: Active External Knowledge Infusion Learning for Question Answering in Low Data Regime
2021
Deep learning models have set benchmark results in various Natural Language Processing tasks. However, these models require an enormous amount of training data, which is infeasible in many practical [...]
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
TLDR: This work presents CommonsenseQA, a challenging new dataset for commonsense question answering, which extracts from ConceptNet multiple target concepts that share the same semantic relation to a single source concept.

References

Showing 1–10 of 52 references
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
TLDR: A new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering constitute the AI2 Reasoning Challenge (ARC), which requires far more powerful knowledge and reasoning than previous challenges such as SQuAD or SNLI.
A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task
TLDR: A thorough examination of this new reading comprehension task, creating over a million training examples by pairing CNN and Daily Mail news articles with their summarized bullet points, and showing that a neural network can be trained to perform well on it.
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
TLDR: It is shown that, compared to other recently introduced large-scale datasets, TriviaQA has relatively complex, compositional questions, exhibits considerable syntactic and lexical variability between questions and their answer-evidence sentences, and requires more cross-sentence reasoning to find answers.
KG^2: Learning to Reason Science Exam Questions with Contextual Knowledge Graph Embeddings
TLDR: This paper proposes a novel framework for answering science exam questions that mimics the human solving process in an open-book exam and outperforms previous state-of-the-art QA systems.
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TLDR: A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
Reading Wikipedia to Answer Open-Domain Questions
TLDR: This approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network trained to detect answers in Wikipedia paragraphs; both modules are highly competitive with existing counterparts. An illustrative retrieval sketch follows.
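As an illustration of the retrieval half of that pipeline, the sketch below ranks paragraphs against a question using unigram-plus-bigram TF-IDF. The actual system uses hashed bigram counts, so this scikit-learn version is an approximation under that assumption, not the paper's implementation:

# Approximate sketch of bigram TF-IDF retrieval; scikit-learn's
# TfidfVectorizer stands in for the paper's hashed bigram features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

paragraphs = [
    "Metals such as steel are good conductors of heat and electricity.",
    "Cotton and denim are common textile fibers used in clothing.",
    "Open book exams allow students to consult a set of core facts.",
]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))  # unigrams + bigrams
doc_matrix = vectorizer.fit_transform(paragraphs)

query = "Which material lets heat travel through?"
scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
best = scores.argmax()
print(f"Top paragraph ({scores[best]:.3f}): {paragraphs[best]}")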
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
TLDR: MCTest is presented, a freely available set of stories and associated questions for research on machine comprehension of text; it requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
NewsQA: A Machine Comprehension Dataset
TLDR: NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs, is presented; analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment.
Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences
TLDR: The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that require reasoning skills; human solvers achieve an F1 score of 88.1%.
Question Answering via Integer Programming over Semi-Structured Knowledge
TLDR: This work proposes a structured inference system for this task, formulated as an Integer Linear Program (ILP), that answers natural language questions using a semi-structured knowledge base derived from text, including questions requiring multi-step inference and a combination of multiple facts. A toy ILP sketch follows.
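To give a flavor of the ILP formulation named above, the toy sketch below uses PuLP to select at most two facts that maximize lexical overlap with a question and a hypothetical candidate answer. It is a drastic simplification for illustration, not the paper's TableILP model:

# Toy sketch of the ILP idea: pick a small set of facts whose words
# overlap the question and a candidate answer, maximizing total overlap.
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary

facts = [
    "metals conduct electricity",
    "a suit of armor is made of metal",
    "cotton is an insulator",
]
question = set("can a suit of armor conduct electricity".split())
answer = set("yes metals conduct".split())  # hypothetical candidate answer

def overlap(fact: str) -> int:
    words = set(fact.split())
    return len(words & question) + len(words & answer)

prob = LpProblem("fact_selection", LpMaximize)
x = [LpVariable(f"use_fact_{i}", cat=LpBinary) for i in range(len(facts))]
prob += lpSum(overlap(f) * x[i] for i, f in enumerate(facts))  # objective
prob += lpSum(x) <= 2  # budget: use at most two facts

prob.solve()
chosen = [f for i, f in enumerate(facts) if x[i].value() == 1]
print("Selected facts:", chosen)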