Reward learning from human preferences and demonstrations in Atari
@inproceedings{Ibarz2018RewardLF,
  title={Reward learning from human preferences and demonstrations in Atari},
  author={Borja Ibarz and J. Leike and Tobias Pohlen and Geoffrey Irving and S. Legg and Dario Amodei},
  booktitle={NeurIPS},
  year={2018}
}
To solve complex real-world problems with reinforcement learning, we cannot rely on manually specified reward functions. [...] Key Method: We train a deep neural network to model the reward function and use its predicted reward to train a DQN-based deep reinforcement learning agent on 9 Atari games. Our approach beats the imitation learning baseline in 7 games and achieves strictly superhuman performance on 2 games without using game rewards. Additionally, we investigate the goodness of fit of the reward model…
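The abstract excerpt does not say how the reward model is fit to human preferences. As a minimal sketch, assuming a Bradley-Terry style comparison between two trajectory segments (a common choice in preference-based reward learning, and not confirmed by the text above), the training signal could look like the following. All names here (`segment_return`, `preference_probability`, `preference_loss`, `toy_reward`) are hypothetical and chosen for illustration; a real system would replace `toy_reward` with a neural network and feed its predictions to the DQN-based agent in place of game rewards.

```python
import math

def segment_return(reward_fn, segment):
    """Sum the predicted reward over every observation in one segment."""
    return sum(reward_fn(obs) for obs in segment)

def preference_probability(reward_fn, seg_a, seg_b):
    """Assumed Bradley-Terry model: P(seg_a preferred) = sigmoid(R(a) - R(b))."""
    diff = segment_return(reward_fn, seg_a) - segment_return(reward_fn, seg_b)
    # Numerically stable sigmoid that avoids overflow for large |diff|.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)

def preference_loss(reward_fn, seg_a, seg_b, label):
    """Cross-entropy loss; label is 1.0 if the human preferred seg_a, else 0.0."""
    eps = 1e-12
    p = preference_probability(reward_fn, seg_a, seg_b)
    p = min(max(p, eps), 1.0 - eps)  # clamp so log() never sees 0
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

# Hypothetical stand-in for the learned reward network: a fixed linear function
# of a two-feature observation.
def toy_reward(obs):
    return 1.0 * obs[0] - 1.0 * obs[1]

seg_a = [(1.0, 0.0), (1.0, 0.0)]  # predicted return  2.0
seg_b = [(0.0, 1.0), (0.0, 1.0)]  # predicted return -2.0

p = preference_probability(toy_reward, seg_a, seg_b)
loss_agree = preference_loss(toy_reward, seg_a, seg_b, 1.0)
loss_disagree = preference_loss(toy_reward, seg_a, seg_b, 0.0)
```

In this toy setup the loss is small when the human label agrees with the model's ranking of the two segments and large when it disagrees, which is the gradient signal a reward network would be trained on.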
Supplemental Code
Code for the paper, "Learning Human Objectives by Evaluating Hypothetical Behavior"
44 Citations
Active Preference-Based Gaussian Process Regression for Reward Learning
- Computer Science
- RSS 2020
- 2020
- 9
Learning Reward Functions by Integrating Human Demonstrations and Preferences
- Computer Science
- Robotics: Science and Systems
- 2019
- 21
Fast Adaptation of Deep Reinforcement Learning-Based Navigation Skills to Human Preference
- Computer Science
- 2020 IEEE International Conference on Robotics and Automation (ICRA)
- 2020
- 1
Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments
- Computer Science
- ArXiv
- 2020
Inverse Reinforcement Learning from a Gradient-based Learner
- Computer Science, Mathematics
- NeurIPS
- 2020
- Highly Influenced
Intrinsic Reward Driven Imitation Learning via Generative Model
- Computer Science, Mathematics
- ICML
- 2020
- 1
Meta-Reward Model Based on Trajectory Data with k-Nearest Neighbors Method
- Computer Science
- 2020 International Joint Conference on Neural Networks (IJCNN)
- 2020
Learning Human Objectives by Evaluating Hypothetical Behavior
- Computer Science, Mathematics
- ICML
- 2020
- 13
References
SHOWING 1-10 OF 60 REFERENCES
Overcoming Exploration in Reinforcement Learning with Demonstrations
- Computer Science, Mathematics
- 2018 IEEE International Conference on Robotics and Automation (ICRA)
- 2018
- 266
Reinforcement and Imitation Learning for Diverse Visuomotor Skills
- Computer Science, Engineering
- Robotics: Science and Systems
- 2018
- 154
Human-level control through deep reinforcement learning
- Computer Science, Medicine
- Nature
- 2015
- 11,340
A Survey of Preference-Based Reinforcement Learning Methods
- Computer Science
- J. Mach. Learn. Res.
- 2017
- 82
Generative Adversarial Imitation Learning
- Computer Science, Mathematics
- NIPS
- 2016
- 979
- Highly Influential