Corpus ID: 2498076

What’s in an Explanation? Characterizing Knowledge and Inference Requirements for Elementary Science Exams

Peter Alexander Jansen, Niranjan Balasubramanian, Mihai Surdeanu, Peter Clark
International Conference on Computational Linguistics (COLING)
QA systems have been making steady advances in the challenging elementary science exam domain. In this work, we develop an explanation-based analysis of knowledge and inference requirements, which supports a fine-grained characterization of the challenges. In particular, we model the requirements based on appropriate sources of evidence to be used for the QA task. We create requirements by first identifying suitable sentences in a knowledge base that support the correct answer, then using these…


WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-hop Inference

A corpus of explanations for standardized science exams, a recent challenge task for question answering, is presented and an explanation-centered tablestore is provided, a collection of semi-structured tables that contain the knowledge to construct these elementary science explanations.

Ranking Facts for Explaining Answers to Elementary Science Questions

Considering automated reasoning for elementary science question answering, this work addresses the novel task of generating explanations for answers from human-authored facts using a practically scalable framework of feature-rich support vector machines leveraging domain-targeted, hand-crafted features.

WorldTree V2: A Corpus of Science-Domain Structured Explanations and Inference Patterns supporting Multi-Hop Inference

This work presents the second iteration of the WorldTree project, a corpus of 5,114 standardized science exam questions paired with large detailed multi-fact explanations that combine core scientific knowledge and world knowledge, and uses this explanation corpus to author a set of 344 high-level science domain inference patterns similar to semantic frames supporting multi-hop inference.

A Study of Automatically Acquiring Explanatory Inference Patterns from Corpora of Explanations: Lessons from Elementary Science Exams

This work explores the possibility of generating large explanations, averaging six facts each, by automatically extracting common explanatory patterns from a corpus of manually authored elementary science explanations, represented as lexically connected explanation graphs grounded in a semi-structured knowledge base of tables.

ExplanationLP: Abductive Reasoning for Explainable Science Question Answering

A novel approach for answering and explaining multiple-choice science questions by reasoning on grounding and abstract inference chains that elicits explanations by constructing a weighted graph of relevant facts for each candidate answer and extracting the facts that satisfy certain structural and semantic constraints.

Explainable Inference Over Grounding-Abstract Chains for Science Questions

This paper frames question answering as a natural language abductive reasoning problem, constructing plausible explanations for each candidate answer and then selecting the candidate with the best explanation as the final answer via Bayesian Optimisation.

TextGraphs 2019 Shared Task on Multi-Hop Inference for Explanation Regeneration

The Shared Task on Multi-Hop Inference for Explanation Regeneration tasks participants with regenerating detailed gold explanations for standardized elementary science exam questions by selecting facts from a knowledge base of semi-structured tables.

QASC: A Dataset for Question Answering via Sentence Composition

This work presents a multi-hop reasoning dataset, Question Answering via Sentence Composition (QASC), that requires retrieving facts from a large corpus and composing them to answer a multiple-choice question, and provides annotation for supporting facts as well as their composition.

A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset

This work proposes a comprehensive set of definitions of knowledge and reasoning types necessary for answering the questions in the ARC dataset and demonstrates that although naive information retrieval methods return sentences that are irrelevant to answering the query, sufficient supporting text is often present in the (ARC) corpus.

Multi-hop Inference for Sentence-level TextGraphs: How Challenging is Meaningfully Combining Information for Science Question Answering?

This work empirically characterizes the difficulty of building or traversing a graph of sentences connected by lexical overlap, evaluating chance sentence-aggregation quality through 9,784 manually annotated judgements across knowledge graphs built from three free-text corpora.
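The graph construction described above can be sketched in a few lines; this is a minimal illustration under my own assumptions (a toy stopword list, whitespace tokenization), not the paper's pipeline:

```python
from itertools import combinations

STOP = {"the", "a", "an", "of", "is", "are", "to", "in"}  # illustrative stopword list

def content_terms(sentence):
    """Lowercase tokens minus stopwords, with trivial punctuation stripping."""
    return {t.strip(".,").lower() for t in sentence.split()} - STOP

def lexical_overlap_graph(sentences):
    """Connect sentences i and j whenever they share at least one content term."""
    edges = []
    for i, j in combinations(range(len(sentences)), 2):
        shared = content_terms(sentences[i]) & content_terms(sentences[j])
        if shared:
            edges.append((i, j, shared))
    return edges

sents = [
    "Copper conducts electricity.",
    "Copper is a metal.",
    "Plants need sunlight to grow.",
]
print(lexical_overlap_graph(sents))  # → [(0, 1, {'copper'})]
```

Multi-hop inference then amounts to traversing such edges; the paper's point is that most lexical-overlap edges connect sentences that do not combine meaningfully.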

Exploring Markov Logic Networks for Question Answering

A system that reasons with knowledge derived from textbooks, represented in a subset of first-order logic, called Praline, demonstrates a 15% accuracy boost and a 10x reduction in runtime compared to other MLN-based methods, and comparable accuracy to word-based baseline approaches.

Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions

This paper describes an alternative approach that operates at three levels of representation and reasoning: information retrieval, corpus statistics, and simple inference over a semi-automatically constructed knowledge base, to achieve substantially improved results.

A study of the knowledge base requirements for passing an elementary science test

The analysis suggests that, in addition to fact extraction from text and statistically driven rule extraction, three other styles of automatic knowledge base construction (AKBC) would be useful: acquiring definitional knowledge, direct 'reading' of rules from texts that state them, and, given a particular representational framework, acquisition of specific instances of those models from text.

Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

This work argues for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering, and classify these tasks into skill sets so that researchers can identify (and then rectify) the failings of their systems.

Question Answering via Integer Programming over Semi-Structured Knowledge

This work proposes a structured inference system for this task, formulated as an Integer Linear Program (ILP), that answers natural language questions using a semi-structured knowledge base derived from text, including questions requiring multi-step inference and a combination of multiple facts.
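To make the fact-combination objective concrete, here is a brute-force analogue of that kind of selection problem: choose a small subset of facts maximizing coverage of question and answer terms, with a penalty per fact used. This is emphatically not the paper's ILP (which uses a real solver over semi-structured tables); it is a hypothetical toy with invented names, feasible only for tiny fact sets:

```python
from itertools import chain, combinations

def _terms(text):
    """Lowercase tokens with trivial punctuation stripping (illustrative only)."""
    return {t.strip(".,?").lower() for t in text.split()}

def best_support(question, answer, facts, max_facts=2):
    """Exhaustively pick <= max_facts facts maximizing covered question+answer
    terms, minus a small per-fact penalty (stand-in for an ILP objective)."""
    target = _terms(question) | _terms(answer)
    best, best_score = (), float("-inf")
    subsets = chain.from_iterable(
        combinations(range(len(facts)), k) for k in range(1, max_facts + 1))
    for subset in subsets:
        covered = set().union(*(_terms(facts[i]) for i in subset)) & target
        score = len(covered) - 0.1 * len(subset)
        if score > best_score:
            best, best_score = subset, score
    return [facts[i] for i in best]

facts = ["Plants need sunlight.", "Copper is a metal.", "Growing requires energy."]
print(best_support("What do plants need to grow?", "sunlight", facts))
```

An ILP formulation expresses the same objective with 0/1 indicator variables per fact and linear constraints, letting a solver handle search spaces where enumeration is hopeless.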

Learning question classifiers: the role of semantic information

It is shown that, in the context of question classification, augmenting the input of the classifier with appropriate semantic category information results in significant improvements to classification accuracy.

Higher-order Lexical Semantic Models for Non-factoid Answer Reranking

This work introduces a higher-order formalism that allows all these lexical semantic models to chain direct evidence to construct indirect associations between question and answer texts, by casting the task as the traversal of graphs that encode direct term associations.

A library of generic concepts for composing knowledge bases

The component library is described: a hierarchy of reusable, composable, domain-independent knowledge units, influenced heavily by linguistic resources, which emphasizes coverage, access, and semantics.

My Computer Is an Honor Student - but How Intelligent Is It? Standardized Tests as a Measure of AI

It is argued that machine performance on standardized tests should be a key component of any new measure of AI, because attaining a high level of performance requires solving significant AI problems involving language understanding and world modeling - critical skills for any machine that lays claim to intelligence.

Discourse Complements Lexical Semantics for Non-factoid Answer Reranking

We propose a robust answer reranking model for non-factoid questions that integrates lexical semantics with discourse information, driven by two representations of discourse: a shallow representation based on discourse markers, and a deep one based on Rhetorical Structure Theory (RST).