From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project

@article{Clark2020FromT,
  title={From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project},
  author={P. Clark and Oren Etzioni and Daniel Khashabi and Tushar Khot and B. D. Mishra and Kyle Richardson and Ashish Sabharwal and Carissa Schoenick and Oyvind Tafjord and Niket Tandon and Sumithra Bhakthavatsalam and Dirk Groeneveld and Michal Guerquin and Michael Schmitz},
  journal={AI Mag.},
  year={2020},
  volume={41},
  pages={39--53}
}
AI has achieved remarkable mastery over games such as Chess, Go, and Poker, and even Jeopardy, but the rich variety of standardized exams has remained a landmark challenge. Even in 2016, the best AI system achieved merely 59.3% on an 8th Grade science exam challenge. This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90% on the exam's non-diagram, multiple choice (NDMC) questions. In addition, our Aristo…