Acceleration of Gradient-Based Path Integral Method for Efficient Optimal and Inverse Optimal Control

  • Masashi Okada, Tadahiro Taniguchi
  • Published 18 October 2017
  • Computer Science
  • 2018 IEEE International Conference on Robotics and Automation (ICRA)
This paper presents a new accelerated path integral method that searches for optimal controls in a small number of iterations. The study builds on the recent observation that a path integral method for reinforcement learning can be interpreted as gradient descent. This observation also applies to the iterative path integral method for optimal control, which makes a convincing argument for applying various optimization methods for gradient descent, such as momentum-based… 
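The gradient-descent view above suggests wrapping the path-integral update in a momentum scheme. A minimal sketch of the idea, assuming a sampling-based softmin update of the MPPI family; function names, the sampling scheme, and all hyperparameters are illustrative, not the paper's implementation:

```python
import numpy as np

def mppi_delta(u, rollout_cost, n_samples=64, sigma=0.5, lam=1.0):
    """One path-integral update, viewed as a (negative-)gradient estimate.

    u: (T, m) control sequence; rollout_cost maps a control sequence to a
    scalar trajectory cost. Sampled perturbations are weighted by the
    exponentiated negative cost (softmin) and averaged.
    """
    eps = sigma * np.random.randn(n_samples, *u.shape)   # exploration noise
    costs = np.array([rollout_cost(u + e) for e in eps])
    w = np.exp(-(costs - costs.min()) / lam)             # softmin weights
    w /= w.sum()
    return np.tensordot(w, eps, axes=1)                  # weighted noise average

def accelerated_mppi(u0, rollout_cost, iters=20, mu=0.9):
    """Nesterov-style momentum wrapped around the path-integral step."""
    u, v = u0.copy(), np.zeros_like(u0)
    for _ in range(iters):
        v = mu * v + mppi_delta(u + mu * v, rollout_cost)  # look-ahead step
        u = u + v
    return u
```

On a toy quadratic rollout cost this converges in a handful of iterations; the paper's point is that such momentum schemes reduce the iteration count of the plain iterative update.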


Model Predictive Optimized Path Integral Strategies

The derivation of model predictive path integral control is generalized to allow for a single joint distribution across controls in the control sequence, allowing adaptive importance sampling algorithms to be incorporated into the original importance sampling step while still retaining the benefits of MPPI.

Real-time Sampling-based Model Predictive Control based on Reverse Kullback-Leibler Divergence and Its Adaptive Acceleration

A novel derivation from reverse Kullback-Leibler divergence is presented, which has a mode-seeking behavior and is likely to find one of the sub-optimal solutions early; a weighted maximum likelihood estimation with positive/negative weights is obtained and solved by a mirror descent (MD) algorithm.
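For context, the mode-seeking behavior follows from the direction of the divergence; this is a standard distinction, not specific to the paper:

```latex
% Forward KL: q is penalized wherever p has mass that q misses
% (mass-covering behavior)
\mathrm{KL}(p \,\|\, q) = \mathbb{E}_{p}\!\left[\log \tfrac{p(\tau)}{q(\tau)}\right]

% Reverse KL: q is penalized for putting mass where p is small,
% so q concentrates on a single mode (mode-seeking behavior)
\mathrm{KL}(q \,\|\, p) = \mathbb{E}_{q}\!\left[\log \tfrac{q(\tau)}{p(\tau)}\right]
```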

Constrained stochastic optimal control with learned importance sampling: A path integral approach

This work proposes an algorithm, based on the path integral formulation of stochastic optimal control extended with constraint-handling capabilities, that can control a wide range of high-dimensional robotic systems in challenging scenarios.

An Online Learning Approach to Model Predictive Control

This paper proposes a new algorithm based on dynamic mirror descent (DMD), an online learning algorithm designed for non-stationary setups; the approach provides a fresh perspective on previous heuristics used in MPC and suggests a principled way to design new MPC algorithms.

Learning to Optimize in Model Predictive Control

  • J. Sacks, Byron Boots
  • Computer Science
    2022 International Conference on Robotics and Automation (ICRA)
  • 2022
This work focuses on learning to optimize more effectively within Model Predictive Control by improving the MPC update rule, and demonstrates that this approach can outperform an MPC controller given the same number of samples.

Model Predictive Path Integral Control Framework for Partially Observable Navigation: A Quadrotor Case Study

A generic MPPI control framework is proposed for 2D or 3D autonomous navigation tasks in either fully or partially observable environments, which are the most prevalent in robotics applications.

Variational Inference MPC for Bayesian Model-based Reinforcement Learning

A variational inference MPC is introduced, which reformulates various stochastic methods, including CEM, in a Bayesian fashion, along with a novel instance of the framework, called probabilistic action ensembles with trajectory sampling (PaETS), which can capture multimodal uncertainties in both dynamics and optimal trajectories.

Learning Sampling Distributions for Model Predictive Control

This work frames the learning problem as bi-level optimization, shows how to train the controller with backpropagation-through-time, and uses a normalizing parameterization of the distribution whose tractable density avoids requiring differentiability of the dynamics and cost function.

Control as Hybrid Inference

This work presents an implementation of CHI which naturally mediates the balance between iterative and amortised inference, and provides a principled framework for harnessing the sample efficiency of model-based planning while retaining the asymptotic performance of model-free policy optimisation.

Reinforcement Learning as Iterative and Amortised Inference

It is demonstrated that a wide range of algorithms can be classified in this manner, providing a fresh perspective, highlighting existing similarities, and identifying relatively unexplored parts of the algorithmic design space, which suggests new routes to innovative RL algorithms.

A Generalized Path Integral Control Approach to Reinforcement Learning

The framework of stochastic optimal control with path integrals is used to derive a novel approach to RL with parameterized policies, demonstrating interesting similarities with previous RL research in the framework of probability matching and providing intuition for why the slightly heuristically motivated probability-matching approach can actually perform well.

Path Integral Networks: End-to-End Differentiable Optimal Control

Preliminary experiment results show that PI-Net, trained by imitation learning, can mimic control demonstrations for two simulated problems: a linear system and a pendulum swing-up problem.

Model Predictive Path Integral Control: From Theory to Parallel Computation

Simulations illustrate the efficiency and robustness of the proposed approach and demonstrate the advantages of computational frameworks that combine concepts from statistical physics, control theory, and parallelization over more traditional approaches to optimal control theory.

Mirror descent search and its acceleration

Aggressive driving with model predictive path integral control

A model predictive control algorithm for optimizing non-linear systems subject to complex cost criteria is presented, built on a stochastic optimal control framework that exploits a fundamental relationship between the information-theoretic notions of free energy and relative entropy.
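The free-energy/relative-entropy relationship referenced here is usually stated as follows; the notation follows common path-integral conventions and is not taken from this paper:

```latex
% Free energy of trajectory cost S(\tau) at temperature \lambda,
% with p the uncontrolled trajectory distribution:
\mathcal{F} = -\lambda \log \mathbb{E}_{p}\!\left[\exp\!\left(-\tfrac{1}{\lambda} S(\tau)\right)\right]

% For any controlled distribution q (Gibbs / Donsker--Varadhan inequality):
\mathcal{F} \;\le\; \mathbb{E}_{q}\!\left[S(\tau)\right] + \lambda\,\mathrm{KL}\!\left(q \,\|\, p\right)

% with equality at the optimal distribution
q^{*}(\tau) \;\propto\; p(\tau)\,\exp\!\left(-\tfrac{1}{\lambda} S(\tau)\right)
```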

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

This work describes and analyzes an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight.
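In the diagonal case, the adaptive proximal function reduces to a per-coordinate step size. A minimal sketch (variable names and the toy objective are illustrative, not from the paper):

```python
import numpy as np

def adagrad_step(x, grad, accum, lr=0.1, eps=1e-8):
    """Diagonal AdaGrad: scale each coordinate by the root of its
    accumulated squared gradients, so rarely-updated coordinates
    keep larger effective learning rates."""
    accum = accum + grad ** 2
    x = x - lr * grad / (np.sqrt(accum) + eps)
    return x, accum

# usage: minimize the toy objective f(x) = x[0]^2 + 10*x[1]^2
x, accum = np.array([1.0, 1.0]), np.zeros(2)
for _ in range(200):
    g = np.array([2 * x[0], 20 * x[1]])
    x, accum = adagrad_step(x, g, accum)
```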

A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems

  • E. TodorovWeiwei Li
  • Mathematics
    Proceedings of the 2005, American Control Conference, 2005.
  • 2005
We present an iterative linear-quadratic-Gaussian method for locally-optimal feedback control of nonlinear stochastic systems subject to control constraints.

Accelerated Mirror Descent in Continuous and Discrete Time

It is shown that a large family of first-order accelerated methods can be obtained as discretizations of a continuous-time ODE, and that these methods converge at an O(1/k²) rate.
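In the Euclidean special case, the underlying dynamics reduce to the continuous-time limit of Nesterov's method studied by Su, Boyd, and Candès; the mirror-descent version generalizes this via a mirror map. Sketch of the Euclidean case only:

```latex
\ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f\!\big(X(t)\big) = 0,
\qquad
f\!\big(X(t)\big) - f^{\star} \le O\!\left(\frac{1}{t^{2}}\right)
```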

Information theoretic MPC for model-based reinforcement learning

An information-theoretic model predictive control algorithm is introduced that handles complex cost criteria and general nonlinear dynamics, and that uses multi-layer neural networks as dynamics models to solve model-based reinforcement learning tasks.

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
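The update the summary describes can be sketched in a few lines: exponential moving averages of the gradient and squared gradient, with bias correction for the zero initialization. The toy usage below is illustrative:

```python
import numpy as np

def adam_step(x, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: moving averages of the gradient (m) and squared
    gradient (v), bias-corrected because both start at zero."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)        # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)        # bias-corrected second moment
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v

# usage: minimize f(x) = x^2 starting from x = 1.0
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 201):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.1)
```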