Corpus ID: 222140747

Chess as a Testing Grounds for the Oracle Approach to AI Safety

@article{Miller2021ChessAA,
  title={Chess as a Testing Grounds for the Oracle Approach to AI Safety},
  author={James D. Miller and Roman V. Yampolskiy and Olle H{\"a}ggstr{\"o}m and Stuart Armstrong},
  journal={ArXiv},
  year={2021},
  volume={abs/2010.02911}
}
To reduce the danger of powerful super-intelligent AIs, we might make the first such AIs oracles that can only send and receive messages. This paper proposes a possibly practical means of using machine learning to create two classes of narrow AI oracles that would provide chess advice: those aligned with the player's interest, and those that want the player to lose and give deceptively bad advice. The player would be uncertain which type of oracle it was interacting with. As the oracles would…

References

Thinking Inside the Box: Controlling and Using an Oracle AI
TLDR
This paper analyzes and critiques various methods of controlling the AI, and suggests that an Oracle AI might be safer than unrestricted AI but still remains potentially dangerous.
Good and safe uses of AI Oracles
TLDR
Two designs for Oracles are presented which, even under pessimistic assumptions, will not manipulate their users into releasing them and yet will still be incentivised to provide their users with helpful answers.
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
TLDR
This paper generalizes the AlphaZero approach into a single AlphaZero algorithm that can achieve superhuman performance in many challenging games, and that convincingly defeated a world champion program in the games of chess and shogi (Japanese chess), as well as Go.
AI safety via debate
TLDR
This work proposes training agents via self-play on a zero-sum debate game, focusing on potential weaknesses as the model scales up, and proposes future human and computer experiments to test these properties.
Behind Deep Blue: Building the Computer that Defeated the World Chess Champion
From the Publisher: On May 11, 1997, as millions worldwide watched a stunning victory unfold on television, a machine shocked the chess world by defeating the defending world champion, Garry…
Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess
TLDR
An analytic comparison shows that pieces are valued differently between variants and that some variants are more decisive than classical chess, demonstrating the rich possibilities that lie beyond the rules of modern chess.
Human Compatible: Artificial Intelligence and the Problem of Control
"The most important book I have read in quite some time" (Daniel Kahneman); "A must-read" (Max Tegmark); "The book we've all been waiting for" (Sam Harris) LONGLISTED FOR THE 2019 FINANCIAL TIMES AND…
Artificial Intelligence as a Positive and Negative Factor in Global Risk
By far the greatest danger of Artificial Intelligence is that people conclude too early that they understand it. Of course this problem is not limited to the field of AI. Jacques Monod wrote: "A…
Cooperative Inverse Reinforcement Learning
TLDR
It is shown that computing optimal joint policies in CIRL games can be reduced to solving a POMDP, it is proved that optimality in isolation is suboptimal in CIRL, and an approximate CIRL algorithm is derived.
Superintelligence: Paths, Dangers, Strategies
The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. Other animals have stronger muscles…