The Off-Switch Game

@article{HadfieldMenell2016TheOG,
  title={The Off-Switch Game},
  author={Dylan Hadfield-Menell and Anca D. Dragan and Pieter Abbeel and Stuart J. Russell},
  journal={ArXiv},
  year={2016},
  volume={abs/1611.08219}
}
It is clear that one of the primary tools we can use to mitigate the potential risk from a misbehaving AI system is the ability to turn the system off. As the capabilities of AI systems improve, it is important to ensure that such systems do not adopt subgoals that prevent a human from switching them off. This is a challenge because many formulations of rational agents create strong incentives for self-preservation. This is not caused by a built-in instinct, but because a rational agent will…

Citations

Publications citing this paper.

Towards Robust End-to-End Alignment

  • SafeAI@AAAI
  • 2019
CITES BACKGROUND

References

Publications referenced by this paper.

The Basic AI Drives

HIGHLY INFLUENTIAL

The off-switch game

D. Hadfield-Menell, A. D. Dragan, P. Abbeel, S. J. Russell
  • CoRR abs/1611.08219.
  • 2016
HIGHLY INFLUENTIAL

Corrigibility

N. Soares, B. Fallenstein, S. Armstrong, E. Yudkowsky
  • Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence.
  • 2015
HIGHLY INFLUENTIAL

Cognition and Incomplete Contracts

J. Tirole
  • 2009
HIGHLY INFLUENTIAL

Here’s what Facebook’s artificial intelligence expert thinks about the future

G. M. Del Prado
  • Tech Insider 9/23/15.
  • 2015
HIGHLY INFLUENTIAL

Incentives in Organizations

HIGHLY INFLUENTIAL

Should We Fear Supersmart Robots?

  • Scientific American
  • 2016
