Acceleration of Gradient-Based Path Integral Method for Efficient Optimal and Inverse Optimal Control
@article{Okada2017AccelerationOG,
  title={Acceleration of Gradient-Based Path Integral Method for Efficient Optimal and Inverse Optimal Control},
  author={Masashi Okada and Tadahiro Taniguchi},
  journal={2018 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2018},
  pages={3013-3020}
}
This paper deals with a new accelerated path integral method that iteratively searches for optimal controls within a small number of iterations. The study builds on the recent observation that a path integral method for reinforcement learning can be interpreted as gradient descent. This observation also applies to the iterative path integral method for optimal control, which makes a convincing argument for utilizing optimization methods designed for gradient descent, such as momentum-based…
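To make the gradient-descent reading concrete, the following is a minimal sketch (not the authors' implementation): the standard MPPI-style softmax-weighted average of sampled perturbations is treated as an update direction for the control sequence, and a generic heavy-ball momentum term is layered on top as one possible acceleration. The callables `rollout_costs` and `sample_perturbations` are hypothetical placeholders for the simulator and the exploration-noise sampler.

```python
import numpy as np

def pi_update_direction(noise, costs, lam):
    """Softmax-weighted average of sampled control perturbations.

    In the gradient-descent view of the iterative path integral method,
    this weighted average plays the role of a (negative) gradient step
    on the control sequence.

    noise : (K, T, m) sampled perturbations of the control sequence
    costs : (K,)      trajectory costs of the perturbed rollouts
    lam   : temperature of the exponential weighting
    """
    w = np.exp(-(costs - costs.min()) / lam)
    w /= w.sum()
    return np.einsum('k,ktm->tm', w, noise)

def accelerated_pi_control(u, rollout_costs, sample_perturbations,
                           lam=1.0, beta=0.9, iters=20):
    """Iterative path integral optimization with a heavy-ball momentum term.

    `rollout_costs` and `sample_perturbations` are hypothetical callables
    standing in for the rollout simulator and noise sampler.
    """
    v = np.zeros_like(u)                      # momentum buffer
    for _ in range(iters):
        eps = sample_perturbations(u)         # (K, T, m) exploration noise
        costs = rollout_costs(u + eps)        # (K,) costs of perturbed plans
        delta = pi_update_direction(eps, costs, lam)  # plain PI/MPPI step
        v = beta * v + delta                  # accumulate momentum
        u = u + v                             # accelerated control update
    return u
```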
13 Citations
Model Predictive Optimized Path Integral Strategies
- Computer Science, ArXiv
- 2022
The derivation of model predictive path integral control is generalized to allow a single joint distribution across the controls in the control sequence, so that adaptive importance sampling algorithms can be incorporated into the original importance sampling step while still maintaining the benefits of MPPI.
Real-time Sampling-based Model Predictive Control based on Reverse Kullback-Leibler Divergence and Its Adaptive Acceleration
- Computer Science, ArXiv
- 2022
A novel derivation from the reverse Kullback-Leibler divergence is presented, which has mode-seeking behavior and is likely to find one of the sub-optimal solutions early; this yields a weighted maximum likelihood estimation with positive/negative weights, which is solved by a mirror descent (MD) algorithm.
Constrained stochastic optimal control with learned importance sampling: A path integral approach
- Computer Science, Int. J. Robotics Res.
- 2022
This work proposes an algorithm, based on the path integral formulation of stochastic optimal control extended with constraint-handling capabilities, that can control a wide range of high-dimensional robotic systems in such challenging scenarios.
An Online Learning Approach to Model Predictive Control
- Computer Science, Robotics: Science and Systems
- 2019
This paper proposes a new algorithm based on dynamic mirror descent (DMD), an online learning algorithm designed for non-stationary setups; the approach provides a fresh perspective on previous heuristics used in MPC and suggests a principled way to design new MPC algorithms.
Learning to Optimize in Model Predictive Control
- Computer Science, 2022 International Conference on Robotics and Automation (ICRA)
- 2022
This work focuses on learning to optimize more effectively within Model Predictive Control by improving the MPC update rule, and demonstrates that this approach can outperform an MPC controller with the same number of samples.
Model Predictive Path Integral Control Framework for Partially Observable Navigation: A Quadrotor Case Study
- Computer Science, 2020 16th International Conference on Control, Automation, Robotics and Vision (ICARCV)
- 2020
A generic MPPI control framework is proposed that can be used for 2D or 3D autonomous navigation tasks in either fully or partially observable environments, which are the most prevalent in robotics applications.
Variational Inference MPC for Bayesian Model-based Reinforcement Learning
- Computer Science, CoRL
- 2019
A variational inference MPC framework is introduced that reformulates various stochastic methods, including CEM, in a Bayesian fashion, along with a novel instance of the framework, called probabilistic action ensembles with trajectory sampling (PaETS), which can involve multimodal uncertainties in both dynamics and optimal trajectories.
Learning Sampling Distributions for Model Predictive Control
- Computer Science, ArXiv
- 2022
This work frames the learning problem as bi-level optimization, shows how to train the controller with backpropagation-through-time, and uses a normalizing parameterization of the distribution, leveraging its tractable density to avoid requiring differentiability of the dynamics and cost function.
Control as Hybrid Inference
- Computer Science, ArXiv
- 2020
This work presents an implementation of CHI which naturally mediates the balance between iterative and amortised inference, and provides a principled framework for harnessing the sample efficiency of model-based planning while retaining the asymptotic performance of model-free policy optimisation.
Reinforcement Learning as Iterative and Amortised Inference
- Computer Science, ArXiv
- 2020
It is demonstrated that a wide range of algorithms can be classified in this manner, providing a fresh perspective and highlighting existing similarities, and that relatively unexplored parts of the algorithmic design space can be identified, suggesting new routes to innovative RL algorithms.
References
SHOWING 1-10 OF 22 REFERENCES
A Generalized Path Integral Control Approach to Reinforcement Learning
- Computer Science, J. Mach. Learn. Res.
- 2010
The framework of stochastic optimal control with path integrals is used to derive a novel approach to RL with parameterized policies, which demonstrates interesting similarities with previous RL research in the framework of probability matching and provides intuition for why the slightly heuristically motivated probability matching approach can actually perform well.
Path Integral Networks: End-to-End Differentiable Optimal Control
- Computer Science, ArXiv
- 2017
Preliminary experimental results show that PI-Net, trained by imitation learning, can mimic control demonstrations for two simulated problems: a linear system and a pendulum swing-up task.
Model Predictive Path Integral Control: From Theory to Parallel Computation
- Engineering
- 2017
The current simulations illustrate the efficiency and robustness of the proposed approach and demonstrate the advantages of computational frameworks that incorporate concepts from statistical physics, control theory, and parallelization over more traditional approaches to optimal control.
Aggressive driving with model predictive path integral control
- Computer Science, 2016 IEEE International Conference on Robotics and Automation (ICRA)
- 2016
A model predictive control algorithm is presented for optimizing non-linear systems subject to complex cost criteria, using a stochastic optimal control framework based on a fundamental relationship between the information-theoretic notions of free energy and relative entropy.
Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
- Computer Science, J. Mach. Learn. Res.
- 2011
This work describes and analyzes an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as those of the best proximal function that can be chosen in hindsight.
A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems
- Mathematics, Proceedings of the 2005 American Control Conference
- 2005
We present an iterative linear-quadratic-Gaussian method for locally-optimal feedback control of nonlinear stochastic systems subject to control constraints. Previously, similar methods have been…
Accelerated Mirror Descent in Continuous and Discrete Time
- Computer Science, NIPS
- 2015
It is shown that a large family of first-order accelerated methods can be obtained as a discretization of the ODE, and that these methods converge at an O(1/k²) rate.
Information theoretic MPC for model-based reinforcement learning
- Computer Science, 2017 IEEE International Conference on Robotics and Automation (ICRA)
- 2017
An information-theoretic model predictive control algorithm is introduced that can handle complex cost criteria and general nonlinear dynamics, and that uses multi-layer neural networks as dynamics models to solve model-based reinforcement learning tasks.
Adam: A Method for Stochastic Optimization
- Computer Science, ICLR
- 2015
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
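Since the paper's acceleration builds on optimizers of this kind, a minimal sketch of the standard Adam update rule (bias-corrected first- and second-moment estimates followed by an element-wise adaptive step) is given below; variable names and default hyperparameters follow common convention rather than any specific implementation.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on parameters `theta` given gradient `grad`.

    m, v : running first- and second-moment estimates
    t    : 1-based iteration counter used for bias correction
    """
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```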