Corpus ID: 210023753

Information Theoretic Model Predictive Q-Learning

@article{Bhardwaj2020InformationTM,
  title={Information Theoretic Model Predictive Q-Learning},
  author={Mohak Bhardwaj and A. Handa and D. Fox and B. Boots},
  journal={ArXiv},
  year={2020},
  volume={abs/2001.02153}
}
  • Mohak Bhardwaj, A. Handa, D. Fox, B. Boots
  • Published 2020
  • Computer Science, Mathematics
  • ArXiv
  • Model-free Reinforcement Learning (RL) works well when experience can be collected cheaply, and model-based RL is effective when system dynamics can be modeled accurately. However, both assumptions can be violated in real-world problems such as robotics, where querying the system can be expensive and real-world dynamics can be difficult to model. In contrast to RL, Model Predictive Control (MPC) algorithms use a simulator to optimize a simple policy class online, constructing a closed-loop…
