Corpus ID: 102350686

Reinforced Imitation in Heterogeneous Action Space

  • Konrad Zolna, N. Rostamzadeh, Yoshua Bengio, Sungjin Ahn, Pedro H. O. Pinheiro
  • Published 2019
  • Computer Science, Mathematics
  • ArXiv
  • Imitation learning is an effective alternative for learning a policy when the reward function is sparse. In this paper, we consider a challenging setting where an agent and an expert use different actions from each other. We assume that the agent has access to a sparse reward function and state-only expert observations. We propose a method which gradually balances between the imitation learning cost and the reinforcement learning objective. In addition, this method adapts the agent's…
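
    The idea of gradually balancing an imitation signal against a sparse task reward can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's method: the abstract is truncated and does not give the schedule, so the linear decay, the shift from imitation toward the environment reward, and the names `mixing_weight`, `combined_reward`, and `anneal_steps` are all hypothetical.

```python
def mixing_weight(step: int, anneal_steps: int) -> float:
    """Imitation weight, decayed linearly from 1.0 to 0.0 (assumed schedule)."""
    if anneal_steps <= 0:
        return 0.0
    return max(0.0, 1.0 - step / anneal_steps)


def combined_reward(
    env_reward: float,
    imitation_reward: float,
    step: int,
    anneal_steps: int,
) -> float:
    """Blend a sparse environment reward with an imitation signal.

    Early in training the imitation term dominates; once `step` reaches
    `anneal_steps`, only the environment reward remains. This is a
    hypothetical blending rule, not one taken from the paper.
    """
    w = mixing_weight(step, anneal_steps)
    return w * imitation_reward + (1.0 - w) * env_reward
```

    In such a scheme, `imitation_reward` would typically come from a model that scores how closely the agent's visited states resemble the expert's state-only observations, which is what makes it usable when the agent and expert do not share actions.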
    6 Citations

    Positive-Unlabeled Reward Learning
    • 6
    • Highly Influenced
    • PDF
    Offline Learning from Demonstrations and Unlabeled Experience
    • 1
    • PDF
    Towards intervention-centric causal reasoning in learning agents
    • Highly Influenced
    • PDF
    Making Efficient Use of Demonstrations to Solve Hard Exploration Problems
    • 20
    • PDF


    Internal Model from Observations for Reward Shaping
    • 9
    • PDF
    Generative Adversarial Imitation Learning
    • 943
    • Highly Influential
    • PDF
    Overcoming Exploration in Reinforcement Learning with Demonstrations
    • 255
    • PDF
    Third-Person Imitation Learning
    • 127
    • PDF
    Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation
    • 145
    • PDF
    Reinforcement Learning from Imperfect Demonstrations
    • 100
    • PDF
    Apprenticeship learning via inverse reinforcement learning
    • 2,021
    • PDF
    Reinforcement and Imitation Learning for Diverse Visuomotor Skills
    • 148
    • PDF
    Policy Optimization with Demonstrations
    • 47
    • PDF
    Observational Learning by Reinforcement Learning
    • 22
    • PDF