Corpus ID: 53424488

Reward learning from human preferences and demonstrations in Atari

  • Borja Ibarz, J. Leike, Tobias Pohlen, Geoffrey Irving, S. Legg, Dario Amodei
  • Published in NeurIPS 2018
  • Computer Science, Mathematics
  • To solve complex real-world problems with reinforcement learning, we cannot rely on manually specified reward functions. [...] Key Method: We train a deep neural network to model the reward function and use its predicted reward to train a DQN-based deep reinforcement learning agent on 9 Atari games. Our approach beats the imitation learning baseline in 7 games and achieves strictly superhuman performance on 2 games without using game rewards. Additionally, we investigate the goodness of fit of the reward model…
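As a rough illustration of the reward-modelling step the abstract describes, the sketch below fits a reward model to pairwise human preferences between trajectory segments. It assumes a Bradley-Terry style preference model with a cross-entropy loss, which is the formulation used in "Deep Reinforcement Learning from Human Preferences"; the truncated abstract here does not spell out the loss. The linear reward model, the feature vectors, and all function names are hypothetical stand-ins for the deep network and Atari observations.

```python
import math


def segment_return(weights, segment):
    """Sum of predicted rewards over a trajectory segment.

    `segment` is a list of per-step feature vectors. The linear model
    is a hypothetical stand-in for the deep reward network.
    """
    return sum(sum(w * x for w, x in zip(weights, step)) for step in segment)


def preference_probability(weights, seg_a, seg_b):
    """P(seg_a is preferred over seg_b) under a Bradley-Terry model."""
    diff = segment_return(weights, seg_a) - segment_return(weights, seg_b)
    return 1.0 / (1.0 + math.exp(-diff))


def preference_loss(weights, seg_a, seg_b, label):
    """Cross-entropy loss; label is 1.0 if the human preferred seg_a, else 0.0."""
    p = preference_probability(weights, seg_a, seg_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))


def sgd_step(weights, seg_a, seg_b, label, lr=0.1):
    """One gradient-descent step on the preference loss for the linear model."""
    p = preference_probability(weights, seg_a, seg_b)
    updated = []
    for i, w in enumerate(weights):
        # d(loss)/d(w_i) = (p - label) * (summed feature i in A minus in B)
        feat_diff = sum(s[i] for s in seg_a) - sum(s[i] for s in seg_b)
        updated.append(w - lr * (p - label) * feat_diff)
    return updated


# Toy data: the annotator prefers segment A, which is rich in feature 0.
seg_a = [[1.0, 0.0], [1.0, 0.0]]
seg_b = [[0.0, 1.0], [0.0, 1.0]]
weights = sgd_step([0.0, 0.0], seg_a, seg_b, label=1.0)
```

In the setup the abstract describes, the learned reward would then replace the game score as the training signal for the DQN-based agent; that agent loop is omitted here.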
    44 Citations
    Leveraging Human Guidance for Deep Reinforcement Learning Tasks
    • Citations: 17
    Learning Reward Functions by Integrating Human Demonstrations and Preferences
    • Citations: 21
    Understanding Learned Reward Functions
    Fast Adaptation of Deep Reinforcement Learning-Based Navigation Skills to Human Preference
    • J. Choi, C. Dance, +4 authors Minsu Kim
    • Computer Science
    • 2020 IEEE International Conference on Robotics and Automation (ICRA)
    • 2020
    • Citations: 1
    Inverse Reinforcement Learning from a Gradient-based Learner
    • Highly Influenced
    Intrinsic Reward Driven Imitation Learning via Generative Model
    • Citations: 1
    Meta-Reward Model Based on Trajectory Data with k-Nearest Neighbors Method
    Learning Human Objectives by Evaluating Hypothetical Behavior
    • Citations: 13

    References


    Deep Reinforcement Learning from Human Preferences
    • Citations: 353
    Overcoming Exploration in Reinforcement Learning with Demonstrations
    • Citations: 266
    Reinforcement and Imitation Learning for Diverse Visuomotor Skills
    • Citations: 154
    Hindsight Experience Replay
    • Citations: 714
    Reinforcement Learning with Unsupervised Auxiliary Tasks
    • Citations: 688
    Playing Atari with Deep Reinforcement Learning
    • Citations: 4,734
    Human-level control through deep reinforcement learning
    • Citations: 11,340
    A Survey of Preference-Based Reinforcement Learning Methods
    • Citations: 82
    Generative Adversarial Imitation Learning
    • Citations: 979
    • Highly Influential
    Diversity is All You Need: Learning Skills without a Reward Function
    • Citations: 277