On Avoiding Power-Seeking by Artificial Intelligence

@article{Turner2022OnAP,
  title={On Avoiding Power-Seeking by Artificial Intelligence},
  author={Alexander Matt Turner},
  journal={ArXiv},
  year={2022},
  volume={abs/2206.11831}
}

We do not know how to align a very intelligent AI agent's behavior with human interests. I investigate whether -- absent a full solution to this AI alignment problem -- we can build smart AI agents which have limited impact on the world, and which do not autonomously seek power. In this thesis, I introduce the attainable utility preservation (AUP) method. I demonstrate that AUP produces conservative, option-preserving behavior within toy gridworlds and within complex environments based on…
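
The abstract's key technical object is the AUP penalty. As a minimal sketch of the idea -- assuming the penalty form from Turner et al.'s earlier "Conservative Agency via Attainable Utility Preservation" (2020), with hypothetical helper names, not code from the thesis itself -- the agent's effective reward subtracts a scaled measure of how much an action shifts its attainable utility (Q-values for auxiliary reward functions) relative to doing nothing:

# Sketch of the AUP reward (hypothetical names; lam is the penalty weight,
# aux_q_values is a list of learned Q-functions for auxiliary rewards).
def aup_reward(primary_reward, aux_q_values, state, action, noop, lam=0.1):
    # Penalize the action by how much it changes attainable utility,
    # measured against the no-op baseline and averaged over auxiliary goals.
    penalty = sum(abs(q(state, action) - q(state, noop)) for q in aux_q_values)
    return primary_reward - lam * penalty / len(aux_q_values)

# Toy usage with two stand-in Q-functions over scalar states and actions:
aux = [lambda s, a: s + a, lambda s, a: s * a]
print(aup_reward(1.0, aux, state=2.0, action=1.0, noop=0.0))  # 1 - 0.1 * 1.5 = 0.85

Scaling lam trades task performance against conservatism: a large penalty makes the no-op baseline dominate, while lam = 0 recovers the plain reward.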