Moving beyond the Turing Test with the Allen AI Science Challenge

@article{Schoenick2017MovingBT,
  title={Moving beyond the Turing Test with the Allen AI Science Challenge},
  author={Carissa Schoenick and Peter Clark and Oyvind Tafjord and Peter D. Turney and Oren Etzioni},
  journal={Communications of the ACM},
  year={2017},
  volume={60},
  pages={60--64}
}
Answering questions correctly from standardized eighth-grade science tests is itself a test of machine intelligence. 
A Survey of Question Answering for Math and Science Problem
TLDR
This survey explores the progress made toward making a machine smart enough to pass the standardized test, along with the challenges and opportunities posed by the domain.
Towards Fluid Machine Intelligence: Can We Make a Gifted AI?
Most applications of machine intelligence have focused on demonstrating crystallized intelligence. Crystallized intelligence relies on accessing problem-specific knowledge, skills and experience...
Twenty Years Beyond the Turing Test: Moving Beyond the Human Judges Too
TLDR
This paper revisits some of the key questions surrounding the Turing test, such as ‘understanding’, commonsense reasoning and extracting meaning from the world, and explores how the new testing paradigms should work to unmask the limitations of current and future AI.
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
TLDR
A new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering constitute the AI2 Reasoning Challenge (ARC), which requires far more powerful knowledge and reasoning than previous challenges such as SQuAD or SNLI.
WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-Hop Inference
TLDR
A corpus of explanations for standardized science exams, a recent challenge task for question answering, is presented and an explanation-centered tablestore is provided, a collection of semi-structured tables that contain the knowledge to construct these elementary science explanations.
Parsing to Programs: A Framework for Situated QA
TLDR
Parsing to Programs, a framework that combines ideas from parsing and probabilistic programming for situated question answering, is introduced and a system that solves pre-university level Newtonian physics questions is built.
What metrics should we use to measure commercial AI?
TLDR
It is suggested that an AI Cosmology might help to identify a single standard model for AI that could be the foundation for a common shared understanding of what AI is and what it is not.
A Study of Automatically Acquiring Explanatory Inference Patterns from Corpora of Explanations: Lessons from Elementary Science Exams
TLDR
The possibility of generating large explanations with an average of six facts by automatically extracting common explanatory patterns from a corpus of manually authored elementary science explanations represented as lexically-connected explanation graphs grounded in a semi-structured knowledge base of tables is explored.
PIQA: Reasoning about Physical Commonsense in Natural Language
TLDR
The task of physical commonsense reasoning and a corresponding benchmark dataset, Physical Interaction: Question Answering (PIQA), are introduced, and an analysis of the dimensions of knowledge that existing models lack is provided, which offers significant opportunities for future research.
Apples to Apples: Learning Semantics of Common Entities Through a Novel Comprehension Task
TLDR
A novel machine comprehension task, GuessTwo, is introduced: given a short paragraph comparing different aspects of two real-world semantically-similar entities, a system should guess what those entities are. The task is indicated to be very challenging across the models.

References

SHOWING 1-10 OF 25 REFERENCES
Beyond the Turing Test
The articles in this special issue of AI Magazine include those that propose specific tests, and those that look at the challenges inherent in building robust, valid, and reliable tests for advancing...
My Computer Is an Honor Student - but How Intelligent Is It? Standardized Tests as a Measure of AI
TLDR
It is argued that machine performance on standardized tests should be a key component of any new measure of AI, because attaining a high level of performance requires solving significant AI problems involving language understanding and world modeling - critical skills for any machine that lays claim to intelligence.
Natural Language Annotations for Question Answering
This paper presents strategies and lessons learned from the use of natural language annotations to facilitate question answering in the START information access system.
Science Question Answering using Instructional Materials
TLDR
A unified max-margin framework is presented that learns to find hidden structures in a corpus of question-answer pairs and instructional materials, and uses what it learns to answer novel elementary science questions.
Information Extraction over Structured Data: Question Answering with Freebase
TLDR
It is shown that relatively modest information extraction techniques, when paired with a webscale corpus, can outperform these sophisticated approaches by roughly 34% relative gain.
Computing Machinery and Intelligence
  • A. Turing
  • Computer Science
  • The Philosophy of Artificial Intelligence
  • 1990
TLDR
If the meaning of the words “machine” and “think” are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, “Can machines think?” is to be sought in a statistical survey such as a Gallup poll.
Open question answering over curated and extracted knowledge bases
TLDR
This paper presents OQA, the first approach to leverage both curated and extracted KBs, and demonstrates that it achieves up to twice the precision and recall of a state-of-the-art Open QA system.
Computing Machinery and Intelligence.
TLDR
The question, “Can machines think?” is considered, and the question is replaced by another, which is closely related to it and is expressed in relatively unambiguous words.
Semantic Parsing on Freebase from Question-Answer Pairs
TLDR
This paper trains a semantic parser that scales up to Freebase and outperforms their state-of-the-art parser on the dataset of Cai and Yates (2013), despite not having annotated logical forms.
Message Understanding Conference- 6: A Brief History
TLDR
MUC-6 introduced several innovations over prior MUCs, most notably in the range of different tasks for which evaluations were conducted and the motivations for the new format.