Corpus ID: 3922816

Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

@article{Clark2018ThinkYH,
  title={Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge},
  author={Peter Clark and Isaac Cowhey and Oren Etzioni and Tushar Khot and Ashish Sabharwal and Carissa Schoenick and Oyvind Tafjord},
  journal={ArXiv},
  year={2018},
  volume={abs/1803.05457}
}
We present a new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering. [...] The dataset contains only natural, grade-school science questions (authored for human tests), and is the largest public-domain set of this kind (7,787 questions). We test several baselines on the Challenge Set, including leading neural models from the SQuAD and SNLI tasks, and find that none are able to significantly outperform a random baseline, reflecting the difficult nature of this task.
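
As context for that headline result, here is a minimal sketch of scoring a uniform-random guesser on an ARC-style multiple-choice file. The JSONL layout shown (a "question" object holding "choices", plus an "answerKey") follows the shape of the public ARC release, but the file name and field access are assumptions for illustration, not the authors' evaluation code.

```python
import json
import random

def random_baseline_accuracy(jsonl_path: str, seed: int = 0) -> float:
    """Score a uniform-random guesser on an ARC-style JSONL file.

    Assumed line format (matching the public ARC release):
      {"question": {"stem": "...", "choices": [{"label": "A", ...}, ...]},
       "answerKey": "A"}
    """
    rng = random.Random(seed)
    correct = total = 0
    with open(jsonl_path) as f:
        for line in f:
            item = json.loads(line)
            labels = [c["label"] for c in item["question"]["choices"]]
            correct += rng.choice(labels) == item["answerKey"]
            total += 1
    return correct / total

# Most ARC questions have four options, so this should land near 25% --
# the bar the abstract says the neural baselines could not significantly beat.
# print(random_baseline_accuracy("ARC-Challenge-Test.jsonl"))  # path illustrative
```
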
Answering Science Exam Questions Using Query Reformulation with Background Knowledge
TLDR: This paper presents a system that reformulates a given question into queries that are used to retrieve supporting text from a large corpus of science-related text, and outperforms several strong baselines on the ARC dataset.
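
The retrieve-and-score loop that entry describes can be illustrated with a toy sketch: keep the question's content terms, append each answer choice to form a query, and score each choice by its best overlap with any sentence in the corpus. This is a stand-in built on a stop-word filter and bag-of-words overlap, not the paper's actual reformulation system.

```python
import re
from collections import Counter

STOP = {"the", "a", "an", "of", "to", "is", "are", "which",
        "what", "in", "on", "and", "or", "for"}

def rewrite(question: str, choice: str) -> list[str]:
    """Toy query rewriter: keep content terms from the question, append the choice."""
    terms = [t for t in re.findall(r"[a-z]+", question.lower()) if t not in STOP]
    return terms + re.findall(r"[a-z]+", choice.lower())

def score(query: list[str], corpus: list[str]) -> float:
    """Score a query by its best term overlap with any single corpus sentence."""
    q, best = Counter(query), 0.0
    for sent in corpus:
        s = Counter(re.findall(r"[a-z]+", sent.lower()))
        best = max(best, sum(min(q[t], s[t]) for t in q) / max(len(query), 1))
    return best

def answer(question: str, choices: list[str], corpus: list[str]) -> str:
    """Pick the choice whose rewritten query is best supported by the corpus."""
    return max(choices, key=lambda c: score(rewrite(question, c), corpus))

corpus = ["The sun is the main source of energy for the water cycle."]
q = "What is the main source of energy for the water cycle?"
print(answer(q, ["the sun", "the wind", "the moon", "ocean currents"], corpus))  # the sun
```
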
Getting Closer to AI Complete Question Answering: A Set of Prerequisite Real Tasks
TLDR: QuAIL is presented, the first RC dataset to combine text-based, world-knowledge, and unanswerable questions, and to provide question-type annotation that enables diagnosis of the reasoning strategies used by a given QA system.
Think you have Solved Direct-Answer Question Answering? Try ARC-DA, the Direct-Answer AI2 Reasoning Challenge
TLDR: The ARC-DA dataset is presented, a direct-answer ("open response", "freeform") version of the ARC (AI2 Reasoning Challenge) multiple-choice dataset, and one of the first DA datasets of natural questions that often require reasoning and where appropriate question decompositions are not evident from the questions themselves.
KG^2: Learning to Reason Science Exam Questions with Contextual Knowledge Graph Embeddings
TLDR: This paper proposes a novel framework for answering science exam questions that mimics the human solving process in an open-book exam and outperforms previous state-of-the-art QA systems.
Reasoning-Driven Question-Answering for Natural Language Understanding
TLDR: This thesis proposes a formulation for abductive reasoning in natural language and shows its effectiveness, especially in domains with limited training data, and presents the first formal framework for multi-step reasoning algorithms in the presence of a few important properties of language use.
A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset
TLDR: This work proposes a comprehensive set of definitions of the knowledge and reasoning types necessary for answering the questions in the ARC dataset, and demonstrates that although naive information retrieval methods return sentences that are irrelevant to answering the query, sufficient supporting text is often present in the (ARC) corpus.
Exploring ways to incorporate additional knowledge to improve Natural Language Commonsense Question Answering
TLDR: This work identifies external knowledge sources and shows that performance improves further when a set of facts retrieved through IR is prepended to each MCQ question during both the training and test phases; it presents three different modes of passing knowledge and five different models of using it, including the standard BERT MCQ model.
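
The "prepend retrieved facts" recipe in that TLDR amounts to simple input construction: concatenate the top-k facts with the question to form the context segment of each (context, choice) pair, then let a multiple-choice head (e.g. BertForMultipleChoice in Hugging Face Transformers) score the pairs. The sketch below only builds the text pairs; the k=3 cutoff and the field layout are illustrative assumptions, not the paper's exact configuration.

```python
def build_mcq_inputs(question: str,
                     choices: list[str],
                     retrieved_facts: list[str],
                     k: int = 3) -> list[tuple[str, str]]:
    """Prepend the top-k retrieved facts to the question, then pair the
    resulting context with each answer choice. A multiple-choice model
    would encode each pair as "[CLS] context [SEP] choice [SEP]" and
    take a softmax over the per-pair scores."""
    context = " ".join(retrieved_facts[:k] + [question])
    return [(context, choice) for choice in choices]

# pairs = build_mcq_inputs("Which gas do plants absorb?",
#                          ["oxygen", "carbon dioxide"],
#                          facts_from_ir)
# Each pair would then be tokenized (question+facts as the first segment,
# the choice as the second) and stacked to shape (num_choices, seq_len).
```
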
Improving Retrieval-Based Question Answering with Deep Inference Models
TLDR: The proposed two-step model outperforms the best retrieval-based solver by over 3% in absolute accuracy and can answer both simple factoid questions and more complex questions that require reasoning or inference.
Advances in Automatically Solving the ENEM
TLDR: This work builds on a previous solution that formulated answering the purely textual multiple-choice questions of the ENEM (Brazil's national high-school exam) as a text information retrieval problem, and investigates how to enhance these methods through text augmentation using word embeddings and WordNet, a structured lexical database in which words are connected according to semantic relations.
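
One concrete way to realize the WordNet side of that augmentation is to expand query terms with lemma names drawn from their synsets, as NLTK exposes them. This is a generic illustration of the technique, not the paper's code.

```python
# Assumes: pip install nltk, then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

def wordnet_expand(term: str, max_terms: int = 5) -> list[str]:
    """Collect distinct WordNet synonyms of a term, e.g. to widen a
    retrieval query before searching the exam's supporting corpus."""
    synonyms: list[str] = []
    for synset in wn.synsets(term):
        for lemma in synset.lemma_names():
            name = lemma.replace("_", " ").lower()
            if name != term and name not in synonyms:
                synonyms.append(name)
    return synonyms[:max_terms]

# wordnet_expand("energy") -> e.g. ["free energy", "vigor", "zip", ...]
```
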

References

Showing 1-10 of 30 references.
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
TLDR: This work argues for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering, and classifies these tasks into skill sets so that researchers can identify (and then rectify) the failings of their systems.
Question Answering via Integer Programming over Semi-Structured Knowledge
TLDR: This work proposes a structured inference system for this task, formulated as an Integer Linear Program (ILP), that answers natural language questions using a semi-structured knowledge base derived from text, including questions requiring multi-step inference and a combination of multiple facts.
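
The ILP formulation in that entry (TableILP) builds a large program over question terms, knowledge-table cells, and answer choices; the sketch below shrinks it to the bare skeleton, one binary variable per answer choice plus a "pick exactly one" constraint, expressed with PuLP. The support scores are assumed to come from some alignment step not shown here.

```python
import pulp  # pip install pulp

def ilp_select(choices: list[str], support: list[float]) -> str:
    """Pick one answer by maximizing support scores under an ILP.
    The real TableILP program also has alignment variables tying question
    terms to table cells, with many more constraints; omitted here."""
    prob = pulp.LpProblem("answer_selection", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(len(choices))]
    prob += pulp.lpSum(s * xi for s, xi in zip(support, x))  # objective: total support
    prob += pulp.lpSum(x) == 1                               # constraint: exactly one answer
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return next(c for c, xi in zip(choices, x) if xi.value() == 1)

# ilp_select(["oxygen", "carbon dioxide"], [0.2, 0.9]) -> "carbon dioxide"
```
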
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
TLDR: It is shown that, in comparison to other recently introduced large-scale datasets, TriviaQA has relatively complex, compositional questions, has considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and requires more cross-sentence reasoning to find answers.
Question Answering as Global Reasoning Over Semantic Abstractions
TLDR: This work presents the first system that reasons over a wide range of semantic abstractions of the text, which are derived using off-the-shelf, general-purpose, pre-trained natural language modules such as semantic role labelers, coreference resolvers, and dependency parsers.
SciTaiL: A Textual Entailment Dataset from Science Question Answering
TLDR: A new dataset and model for textual entailment, derived from treating multiple-choice question-answering as an entailment problem, is presented, and it is demonstrated that one can improve accuracy on SCITAIL by 5% using a new neural model that exploits linguistic structure.
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
TLDR: MCTest is presented, a freely available set of stories and associated questions intended for research on the machine comprehension of text; it requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions
TLDR: This paper evaluates the methods on six years of unseen, unedited exam questions from the NY Regents Science Exam, and shows that the overall system's score is 71.3%, an improvement of 23.8% (absolute) over the MLN-based method described in previous work.
Query-Reduction Networks for Question Answering
TLDR: Query-Reduction Network (QRN), a variant of the Recurrent Neural Network (RNN) that effectively handles both short-term and long-term sequential dependencies to reason over multiple facts, is proposed.
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TLDR: A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
Answering Complex Questions Using Open Information Extraction
TLDR: This work develops a new inference model for Open IE that can work effectively with multiple short facts, noise, and the relational structure of tuples, and significantly outperforms a state-of-the-art structured solver on complex questions of varying difficulty.