Corpus ID: 227261694

Optimal Policies Tend to Seek Power.

  • A. Turner, L. Smith, Rohin Shah, Andrew Critch, P. Tadepalli
  • Published 2020
  • Computer Science
  • arXiv: Artificial Intelligence
  • Some researchers have speculated that capable reinforcement learning agents are often incentivized to seek resources and power in pursuit of their objectives. While seeking power in order to optimize a misspecified objective, agents might be incentivized to behave in undesirable ways, including rationally preventing deactivation and correction. Others have voiced skepticism: human power-seeking instincts seem idiosyncratic, and these urges need not be present in reinforcement learning agents…

