A study of the knowledge base requirements for passing an elementary science test

Peter Clark, Philip Harrison, and Niranjan Balasubramanian. AKBC '13.

Our long-term interest is in machines that contain large amounts of general and scientific knowledge, stored in a "computable" form that supports reasoning and explanation. As a medium-term focus for this, our goal is to have the computer pass a fourth-grade science test, anticipating that much of the required knowledge will need to be acquired semi-automatically. This paper presents the first step towards this goal, namely a blueprint of the knowledge requirements for an early science exam… 

What’s in an Explanation? Characterizing Knowledge and Inference Requirements for Elementary Science Exams

This work develops an explanation-based analysis of knowledge and inference requirements, which supports a fine-grained characterization of the challenges, and compares a retrieval and an inference solver on 212 questions.

Reading and Reasoning with Knowledge Graphs

This thesis presents methods for reasoning over very large knowledge bases, and shows how to apply these methods to models of machine reading, which can successfully incorporate knowledge base information into machine learning models of natural language.

A Study of Automatically Acquiring Explanatory Inference Patterns from Corpora of Explanations: Lessons from Elementary Science Exams

This work explores the possibility of generating large explanations, averaging six facts each, by automatically extracting common explanatory patterns from a corpus of manually authored elementary science explanations, represented as lexically connected explanation graphs grounded in a semi-structured knowledge base of tables.

WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-Hop Inference

A corpus of explanations for standardized science exams, a recent challenge task for question answering, is presented and an explanation-centered tablestore is provided, a collection of semi-structured tables that contain the knowledge to construct these elementary science explanations.

Logico-linguistic Semantic Representation of Documents

  • Sharyar Wani, M. Wahiddin, T. Sembok
  • Computer Science
    2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech)
  • 2016
The results clearly indicate the effectiveness of the knowledge extraction and representation methodology developed, providing machines with intelligence for efficient analysis of data.

A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset

This work proposes a comprehensive set of definitions of knowledge and reasoning types necessary for answering the questions in the ARC dataset and demonstrates that although naive information retrieval methods return sentences that are irrelevant to answering the query, sufficient supporting text is often present in the (ARC) corpus.

ScienceWorld: Is your Agent Smarter than a 5th Grader?

This paper presents a new benchmark, SCIENCEWORLD, to test agents’ scientific reasoning abilities in a new interactive text environment at the level of a standard elementary school science curriculum.

What Does My QA Model Know? Devising Controlled Probes Using Expert Knowledge

A methodology for automatically building probe datasets from expert knowledge sources, allowing for systematic control and a comprehensive evaluation, and confirms that transformer-based multiple-choice QA models are already predisposed to recognize certain types of structural linguistic knowledge.

Ranking Facts for Explaining Answers to Elementary Science Questions

Considering automated reasoning for elementary science question answering, this work addresses the novel task of generating explanations for answers from human-authored facts using a practically scalable framework of feature-rich support vector machines leveraging domain-targeted, hand-crafted features.

How to Write Science Questions that Are Easy for People and Hard for Computers

This work argues that hand-constructed multiple-choice tests, with questions that relate the formal science to the realia of laboratory experiments or of real-world observations are likely to be easy for people and hard for AI programs.

Diagrammatic Representation and Inference

It is shown that properties of the mind, rather than the world, should guide diagramming conventions, whereas an influential thesis based on a philosophy of “real world” ontology makes the opposite prediction.

Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations

A novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts is presented.

Extracting and evaluating general world knowledge from the Brown Corpus

Techniques for extracting general world knowledge from miscellaneous texts by a process of approximate interpretation and abstraction are evaluated, finding that nearly 60% of the extracted propositions are judged favorably under the evaluation scheme by any given judge.

Open Information Extraction: The Second Generation

The second generation of Open IE systems is described; these systems rely on a novel model of how relations and their arguments are expressed in English sentences to double precision/recall compared with previous systems such as TextRunner and WOE.

Acquiring Commonsense Knowledge for a Cognitive Agent

  • James Allen
  • Computer Science
    AAAI Fall Symposium: Advances in Cognitive Systems
  • 2011
This work describes a system that learns conceptual knowledge by deep understanding of WordNet glosses, which is viewed as simultaneously accomplishing two goals: building a rich semantic lexicon useful for natural language processing, and building a knowledge base that encodes common-sense knowledge.

Discovery of inference rules for question-answering

This paper presents an unsupervised algorithm for discovering inference rules from text based on an extended version of Harris’ Distributional Hypothesis, which states that words that occurred in the same contexts tend to be similar.
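The Distributional Hypothesis mentioned above can be illustrated with a minimal sketch: words that share many surrounding contexts receive a high similarity score. This is a toy illustration only, not the paper's DIRT algorithm; the tiny corpus and the Jaccard scoring choice are invented for the example.

```python
# Toy illustration of the Distributional Hypothesis: words sharing many
# contexts score as similar. Not the paper's algorithm; corpus is invented.
from collections import defaultdict

def context_sets(sentences, window=1):
    """Map each word to the set of words seen within `window` positions."""
    contexts = defaultdict(set)
    for sent in sentences:
        words = sent.lower().split()
        for i, w in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            contexts[w].update(words[lo:i] + words[i + 1:hi])
    return contexts

def jaccard(a, b):
    """Jaccard overlap between two context sets (0 when both are empty)."""
    return len(a & b) / len(a | b) if a | b else 0.0

corpus = [
    "the doctor treats the patient",
    "the physician treats the patient",
    "the doctor reads the chart",
]
ctx = context_sets(corpus)
# "doctor" and "physician" share the contexts {the, treats}, so they score
# higher together than "doctor" does with "patient".
sim_dp = jaccard(ctx["doctor"], ctx["physician"])
sim_dt = jaccard(ctx["doctor"], ctx["patient"])
```

On this corpus the shared-context overlap ranks doctor/physician above doctor/patient, which is the intuition the unsupervised rule-discovery algorithm extends from words to dependency paths.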

Random Walk Inference and Learning in A Large Scale Knowledge Base

It is shown that a soft inference procedure based on a combination of constrained, weighted, random walks through the knowledge base graph can be used to reliably infer new beliefs for the knowledge base.
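The random-walk idea can be sketched in miniature: the fraction of short random walks from a source node that reach a target node serves as a soft score for a candidate belief. This is a toy sketch under invented data, not the paper's PRA-style system; the graph, relation names, and walk parameters below are all assumptions for illustration.

```python
# Toy random-walk inference over a tiny knowledge graph: the probability of
# reaching a target via short uniform random walks acts as a soft belief
# score. Not the paper's system; the graph and relations are invented.
import random

GRAPH = {  # node -> list of (relation, neighbor)
    "Pittsburgh": [("locatedIn", "Pennsylvania"), ("hasTeam", "Steelers")],
    "Pennsylvania": [("partOf", "USA")],
    "Steelers": [("playsIn", "NFL")],
    "USA": [],
    "NFL": [],
}

def walk_score(start, target, steps=3, trials=2000, seed=0):
    """Fraction of `trials` random walks of length <= `steps` hitting target."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        node = start
        for _ in range(steps):
            edges = GRAPH[node]
            if not edges:
                break  # dead end: no outgoing relations
            _, node = rng.choice(edges)
            if node == target:
                hits += 1
                break
    return hits / trials

# Connected targets get positive scores; unreachable ones score zero.
score_usa = walk_score("Pittsburgh", "USA")
score_unreachable = walk_score("Pittsburgh", "Paris")
```

A real system learns weights over typed relation paths rather than walking uniformly, but the estimated reachability above is the underlying signal it combines.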

MindNet: Acquiring and Structuring Semantic Information from Text

An overview of the distinguishing characteristics of MindNet, the steps involved in its creation, and its extension beyond dictionary text are provided.

Extracting meronyms for a biology knowledge base using distant supervision

A novel algorithm is introduced, generalizing the “at least one” assumption of multi-instance learning to handle the case where a fixed (but unknown) percentage of bag members are positive examples, for the domain of college biology.

Types of Common-Sense Knowledge Needed for Recognizing Textual Entailment

This work identifies 20 categories of common-sense knowledge that are prevalent in textual entailment, many of which have received scarce attention from researchers building collections of knowledge.