• Corpus ID: 45705325

Elementary School Science and Math Tests as a Driver for AI: Take the Aristo Challenge!

@inproceedings{Clark2015ElementarySS,
  title={Elementary School Science and Math Tests as a Driver for AI: Take the Aristo Challenge!},
  author={Peter Clark},
  booktitle={AAAI},
  year={2015}
}
  • Peter Clark
  • Published in AAAI, 25 January 2015
  • Computer Science
While there has been an explosion of impressive, data-driven AI applications in recent years, machines still largely lack a deeper understanding of the world to answer questions that go beyond information explicitly stated in text, and to explain and discuss those answers. To reach this next generation of AI applications, it is imperative to make faster progress in areas of knowledge, modeling, reasoning, and language. Standardized tests have often been proposed as a driver for such progress… 

Citations

Towards Literate Artificial Intelligence
TLDR
A unified max-margin framework that learns to find hidden structures given a corpus of question-answer pairs, and uses what it learns to answer questions on novel texts to obtain state-of-the-art performance on two well-known natural language comprehension benchmarks.
Humans Keep It One Hundred: an Overview of AI Journey
TLDR
The results of AI Journey, a competition of AI systems aimed at improving AI performance on knowledge bases, reasoning, and text generation, are described, showing different approaches to task understanding and reasoning.
Common Sense, the Turing Test, and the Quest for Real AI
TLDR
Hector Levesque considers the role of language in learning, and identifies a possible mechanism behind common sense and the capacity to call on background knowledge: the ability to represent objects of thought symbolically.
Project Aristo: Towards Machines that Capture and Reason with Science Knowledge
TLDR
This talk will describe the journey of Aristo through various knowledge capture technologies, including acquiring if/then rules, tables, knowledge graphs, and latent neural representations, and speculate on the larger quest towards knowledgeable machines that can reason, explain, and discuss.
Solving Mathematical Puzzles: a Deep Reasoning Challenge (Position Paper)
TLDR
This work proposes a challenge: to design and implement an end-to-end solver for mathematical puzzles able to compete with primary school students, calling for an unprecedented integration of many different AI techniques.
Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions
TLDR
This paper evaluates the methods on six years of unseen, unedited exam questions from the NY Regents Science Exam, and shows that the overall system's score is 71.3%, an improvement of 23.8% (absolute) over the MLN-based method described in previous work.
Considerations for Evaluating Models of Language Understanding and Reasoning
TLDR
A complementary task framework and evaluation dataset modeled closely on [1] is presented, which arguably preserves experimental control and allows for difficulty to be scaled up incrementally while also ensuring that all information relevant to solving the tasks is preserved in the training data.
Towards Question Format Independent Numerical Reasoning: A Set of Prerequisite Tasks
TLDR
This work introduces NUMBERGAME, a multifaceted benchmark to evaluate model performance across numerical reasoning tasks of eight diverse formats, and takes forward the recent progress in generic system development, demonstrating the scope of under-explored tasks.
Using Thought-Provoking Children's Questions to Drive Artificial Intelligence Research
We propose to use thought-provoking children's questions (TPCQs), namely Highlights BrainPlay questions, as a new method to drive artificial intelligence research and to evaluate the capabilities of…
Solving Mathematical Puzzles: A Challenging Competition for AI
TLDR
Competitions have been and are currently run on conversational behavior, automatic control, cooperation and coordination in robotics, logical reasoning and knowledge, and natural language; these have brought many insights and advances to various fields of artificial intelligence.

References

Showing 1-10 of 12 references.
The Limitations of Standardized Science Tests as Benchmarks for Artificial Intelligence Research: Position Paper
TLDR
This position paper argues that standardized tests for elementary science, such as the SAT or Regents tests, are not very good benchmarks for measuring the progress of artificial intelligence systems in understanding basic science, and that more appropriate collections of exam-style problems could be assembled.
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
TLDR
MCTest is presented, a freely available set of stories and associated questions intended for research on the machine comprehension of text that requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
A study of the knowledge base requirements for passing an elementary science test
TLDR
The analysis suggests that as well as fact extraction from text and statistically driven rule extraction, three other styles of automatic knowledge base construction (AKBC) would be useful: acquiring definitional knowledge, direct 'reading' of rules from texts that state them, and, given a particular representational framework, acquisition of specific instances of those models from text.
The Winograd Schema Challenge
TLDR
This paper presents an alternative to the Turing Test that has some conceptual and practical advantages: English-speaking adults will have no difficulty with it, and the subject is not required to engage in a conversation or to fool an interrogator into believing she is dealing with a person.
Overview of Todai Robot Project and Evaluation Framework of its NLP-based Problem Solving
TLDR
The Todai Robot Project task focuses on benchmarking NLP systems for problem solving, and the details of the method to manage question resources and their correct answers, answering tools and participation by researchers in the task are described.
Diagram Understanding in Geometry Questions
TLDR
This paper presents a method for diagram understanding that identifies visual elements in a diagram while maximizing agreement between textual and visual data, and shows that the method's objective function is submodular.
Can an AI get into the University of Tokyo?
For the thousands of secondary school students who take Japan's university entrance exams each year, test days are long-dreaded nightmares of jitters and sweaty palms. But the newest test taker can be…
Etzioni, O. Diagram Understanding in Geometry Questions. AAAI, 2014.
The Grade 4 Elementary-Level Science Test. 2014.
The Limitations of Standardized Science Tests as Benchmarks for AI Research. 2014.