Corpus ID: 52186144

Active Inverse Reward Design

  • S. Mindermann, Rohin Shah, Adam Gleave, Dylan Hadfield-Menell
  • Published 2018
  • Mathematics, Computer Science
  • ArXiv
  • Reward design, the problem of selecting an appropriate reward function for an AI system, is both critically important, as it encodes the task the system should perform, and challenging, as it requires reasoning about and understanding the agent's environment in detail. As a result, system designers often iterate on the reward function in a trial-and-error process to get their desired behavior. We propose structuring this process as a series of reward design queries, where we actively select the…
    7 Citations
