Corpus ID: 45705325

Elementary School Science and Math Tests as a Driver for AI: Take the Aristo Challenge!

@inproceedings{Clark2015ElementarySS,
  title={Elementary School Science and Math Tests as a Driver for AI: Take the Aristo Challenge!},
  author={Peter Clark},
  booktitle={AAAI},
  year={2015}
}
While there has been an explosion of impressive, data-driven AI applications in recent years, machines still largely lack a deeper understanding of the world to answer questions that go beyond information explicitly stated in text, and to explain and discuss those answers. To reach this next generation of AI applications, it is imperative to make faster progress in areas of knowledge, modeling, reasoning, and language. Standardized tests have often been proposed as a driver for such progress…
Towards Literate Artificial Intelligence
Standardized tests are used to test students as they progress in the formal education system. These tests are readily available and have clear evaluation procedures. Hence, it has been proposed that…
Humans Keep It One Hundred: an Overview of AI Journey
TLDR
The results of AI Journey, a competition of AI-systems aimed to improve AI performance on knowledge bases, reasoning and text generation, are described, showing different approaches to task understanding and reasoning.
Common Sense, the Turing Test, and the Quest for Real AI
TLDR
Hector Levesque considers the role of language in learning, and identifies a possible mechanism behind common sense and the capacity to call on background knowledge: the ability to represent objects of thought symbolically.
Project Aristo: Towards Machines that Capture and Reason with Science Knowledge
TLDR
This talk will describe the journey of Aristo through various knowledge capture technologies, including acquiring if/then rules, tables, knowledge graphs, and latent neural representations, and speculate on the larger quest towards knowledgeable machines that can reason, explain, and discuss.
Solving Mathematical Puzzles: a Deep Reasoning Challenge (Position Paper)
TLDR
This work proposes a challenge: to design and implement an end-to-end solver for mathematical puzzles able to compete with primary school students, calling for an unprecedented integration of many different AI techniques.
Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions
TLDR
This paper evaluates the methods on six years of unseen, unedited exam questions from the NY Regents Science Exam, and shows that the overall system's score is 71.3%, an improvement of 23.8% (absolute) over the MLN-based method described in previous work.
Considerations for Evaluating Models of Language Understanding and Reasoning
Efforts to construct tasks for evaluating reasoning systems face a tradeoff between ecological validity and interpretability. That is, as task difficulty and diversity increase, the easier it is to…
Towards Question Format Independent Numerical Reasoning: A Set of Prerequisite Tasks
TLDR
This work introduces NUMBERGAME, a multifaceted benchmark to evaluate model performance across numerical reasoning tasks of eight diverse formats, and takes forward the recent progress in generic system development, demonstrating the scope of under-explored tasks.
Using Thought-Provoking Children's Questions to Drive Artificial Intelligence Research
We propose to use thought-provoking children's questions (TPCQs), namely Highlights BrainPlay questions, as a new method to drive artificial intelligence research and to evaluate the capabilities of…
Solving Mathematical Puzzles: A Challenging Competition for AI
TLDR
Competitions have been and are currently run on conversational behavior, automatic control, cooperation and coordination in robotics, logic reasoning and knowledge, and natural language, which have brought many insights and advancements on various artificial intelligence fields.

References

The Limitations of Standardized Science Tests as Benchmarks for Artificial Intelligence Research: Position Paper
TLDR
This position paper argues that standardized tests for elementary science such as SAT or Regents tests are not very good benchmarks for measuring the progress of artificial intelligence systems in understanding basic science and that more appropriate collections of exam style problems could be assembled.
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
TLDR
MCTest is presented, a freely available set of stories and associated questions intended for research on the machine comprehension of text that requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
A study of the knowledge base requirements for passing an elementary science test
TLDR
The analysis suggests that as well as fact extraction from text and statistically driven rule extraction, three other styles of automatic knowledge base construction (AKBC) would be useful: acquiring definitional knowledge, direct 'reading' of rules from texts that state them, and, given a particular representational framework, acquisition of specific instances of those models from text.
The Winograd Schema Challenge
TLDR
This paper presents an alternative to the Turing Test that has some conceptual and practical advantages: English-speaking adults will have no difficulty with it, and the subject is not required to engage in a conversation and fool an interrogator into believing she is dealing with a person.
Overview of Todai Robot Project and Evaluation Framework of its NLP-based Problem Solving
TLDR
The Todai Robot Project task focuses on benchmarking NLP systems for problem solving, and the details of the method to manage question resources and their correct answers, answering tools and participation by researchers in the task are described.
Diagram Understanding in Geometry Questions
TLDR
This paper presents a method for diagram understanding that identifies visual elements in a diagram while maximizing agreement between textual and visual data, and shows that the method's objective function is submodular.
Can an AI get into the University of Tokyo
For the thousands of secondary school students who take Japan's university entrance exams each year, test days are long-dreaded nightmares of jitters and sweaty palms. But the newest test taker can be…
Etzioni, O. Diagram Understanding in Geometry Questions. AAAI, 2014
The Grade 4 Elementary-Level Science Test, 2014
The Limitations of Standardized Science Tests as Benchmarks for AI Research, 2014