Dual Online Stein Variational Inference for Control and Dynamics

Lucas Gomes Barcelos, Alexander Lambert, Rafael Oliveira, Paulo Borges, Byron Boots, Fabio Tozeto Ramos
Model predictive control (MPC) schemes have a proven track record for delivering aggressive and robust performance in many challenging control tasks, coping with nonlinear system dynamics, constraints, and observational noise. Despite their success, these methods often rely on simple control distributions, which can limit their performance in highly uncertain and complex environments. MPC frameworks must be able to accommodate changing distributions over system parameters, based on the most… 


Variational Inference MPC using Normalizing Flows and Out-of-Distribution Projection

A Model Predictive Control method for collision-free navigation is presented that uses amortized variational inference to approximate the distribution of optimal control sequences, training a normalizing flow conditioned on the start, goal, and environment, together with an approach that performs projection on the environment representation as part of the MPC process.

Variational Inference MPC for Robot Motion with Normalizing Flows

This paper proposes using amortized variational inference to approximate the posterior with a normalizing flow conditioned on the start, goal, and environment, and demonstrates that this approach generalizes to a difficult novel environment and outperforms a baseline sampling-based MPC method on a navigation problem.

Improving Sample-based MPC with Normalizing Flows & Out-of-distribution Projection

A sample-based Model Predictive Control method for collision-free navigation is proposed that uses a conditional normalizing flow as the sampling distribution, conditioned on the start, goal, and environment, to learn a distribution that accounts for both the robot's dynamics and complex obstacle geometries.

Stein Variational Probabilistic Roadmaps

This work proposes a method for Probabilistic Roadmaps that relies on particle-based variational inference to efficiently cover the posterior distribution over feasible regions in configuration space, resulting in sample-efficient generation of planning graphs and large improvements over traditional sampling approaches.

Robot Learning From Randomized Simulations: A Review

A comprehensive review of sim-to-real research for robotics, focusing on “domain randomization”, a technique for learning from randomized simulations.

Particle-Based Adaptive Sampling for Curriculum Learning

This thesis proposes using particle-based variational inference methods (ParVIs) to sample new tasks more effectively from less-covered areas of the task space; it investigates two ParVI methods, Stein variational gradient descent (SVGD) and Stein points, and develops new sampling algorithms based on them for curriculum learning.

Robust Control Under Uncertainty via Bounded Rationality and Differential Privacy

The theory of differential privacy is used to design controllers with bounded sensitivity to errors in state estimates, and to bound the amount of state information used for control in order to impose decision-making under bounded rationality.

Stein Variational Model Predictive Control

This paper proposes a Stein variational gradient descent method to estimate the posterior directly over control parameters, given a cost function and observed state trajectories, and shows that this framework leads to successful planning in challenging, non-convex optimal control problems.
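The SVGD update at the heart of this line of work can be sketched on a toy problem. The quadratic "control cost" (hence Gaussian posterior with mode u* = 2), the fixed kernel bandwidth, and the particle initialization below are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def rbf_kernel(X, h=1.0):
    """RBF kernel matrix K[i, j] = k(x_i, x_j) and the per-pair gradient
    of k with respect to its second argument (used as the repulsive term)."""
    diffs = X[:, None, :] - X[None, :, :]           # (n, n, d), x_i - x_j
    K = np.exp(-np.sum(diffs ** 2, axis=-1) / (2 * h ** 2))
    gradK = diffs / h ** 2 * K[:, :, None]          # d k(x_i, x_j) / d x_j
    return K, gradK

def svgd_step(particles, grad_logp, step=0.2, h=1.0):
    """One SVGD update: kernel-weighted attraction toward high posterior
    density plus a repulsive term that keeps the particles spread out."""
    n = particles.shape[0]
    K, gradK = rbf_kernel(particles, h)
    phi = (K @ grad_logp(particles) + gradK.sum(axis=1)) / n
    return particles + step * phi

# Toy "control posterior": quadratic cost on a scalar control u, so
# log p(u) is proportional to -0.5 * (u - 2)^2 and the score is -(u - 2).
grad_logp = lambda U: -(U - 2.0)
U = np.random.default_rng(0).normal(size=(50, 1))   # initial particles
for _ in range(500):
    U = svgd_step(U, grad_logp)
```

After the loop the particles concentrate around the posterior mode while the repulsive term preserves spread, which is what lets SVGD-based MPC represent multimodal or broad control distributions rather than a single point estimate.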

Adaptive Probabilistic Trajectory Optimization via Efficient Approximate Inference

This paper proposes a new approach, adaptive probabilistic trajectory optimization, that combines the benefits of RL and MPC, using scalable approximate inference to learn and update probabilistic models in an online, incremental fashion while also computing optimal control policies via successive local approximations.

Stochastic Optimal Control as Approximate Input Inference

This work develops the view of Optimal Control as Input Estimation, devising a probabilistic stochastic optimal control formulation that iteratively infers the optimal input distributions by minimizing an upper bound of the control cost.

Variational Inference MPC for Bayesian Model-based Reinforcement Learning

A variational inference MPC framework is introduced that reformulates various stochastic methods, including CEM, in a Bayesian fashion, along with a novel instance of the framework, probabilistic action ensembles with trajectory sampling (PaETS), which can capture multimodal uncertainty in both the dynamics and the optimal trajectories.

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

This paper proposes a new algorithm called probabilistic ensembles with trajectory sampling (PETS) that combines uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation, which matches the asymptotic performance of model-free algorithms on several challenging benchmark tasks, while requiring significantly fewer samples.
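The core PETS recipe, combining an ensemble of probabilistic dynamics models with trajectory-sampling uncertainty propagation, can be sketched as follows. The hand-built scalar "ensemble", the quadratic cost, and the random-shooting planner are simplifying assumptions for illustration (the original work learns neural-network models and uses CEM):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for a learned probabilistic ensemble: each member
# predicts a Gaussian next state for a scalar system x' = a*x + u + noise,
# with disagreement over `a` playing the role of epistemic uncertainty.
ensemble = [{"a": a, "sigma": 0.05} for a in (0.9, 0.95, 1.0, 1.05, 1.1)]

def propagate(x, u, member):
    """Sample a next state from one ensemble member's predictive Gaussian."""
    return member["a"] * x + u + rng.normal(0.0, member["sigma"])

def rollout_cost(x0, controls, n_particles=20):
    """Trajectory-sampling propagation: each particle commits to a single
    ensemble member for the whole rollout, then costs are averaged."""
    total = 0.0
    for _ in range(n_particles):
        member = ensemble[rng.integers(len(ensemble))]
        x = x0
        for u in controls:
            x = propagate(x, u, member)
            total += x ** 2 + 0.1 * u ** 2      # quadratic regulation cost
    return total / n_particles

# Planner: pick the best of a few random control sequences (random shooting).
candidates = rng.normal(0.0, 1.0, size=(64, 5))
best = min(candidates, key=lambda seq: rollout_cost(1.0, seq))
```

Committing each particle to one member (rather than resampling a member per step) is what lets the propagation distinguish epistemic disagreement between models from the per-step aleatoric noise.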

Deep Learning Tubes for Tube MPC

A deep quantile regression framework for control is introduced that enforces probabilistic quantile bounds and quantifies epistemic uncertainty in learning-based control models.

On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference

We present a reformulation of the stochastic optimal control problem in terms of KL divergence minimisation, not only providing a unifying perspective of previous approaches in this area, but also…

DISCO: Double Likelihood-free Inference Stochastic Control

This paper proposes to leverage the power of modern simulators and recent techniques in Bayesian statistics for likelihood-free inference to design a control framework that is efficient and robust with respect to the uncertainty over simulation parameters.

Bayesian model predictive control: Efficient model exploration and regret bounds using posterior sampling

This work presents a learning-based MPC formulation using posterior sampling techniques, which provides finite-time regret bounds on the learning performance while being simple to implement using off-the-shelf MPC software and algorithms.

An Online Learning Approach to Model Predictive Control

This paper proposes a new algorithm based on dynamic mirror descent (DMD), an online learning algorithm designed for non-stationary settings; it provides a fresh perspective on previous heuristics used in MPC and suggests a principled way to design new MPC algorithms.
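The dynamic-mirror-descent idea can be sketched in its simplest form, using a Euclidean mirror map (where DMD reduces to online gradient descent followed by a shift model that predicts how the optimum moves between rounds). The drifting quadratic loss and the known-drift shift operator below are illustrative assumptions:

```python
def dmd_step(theta, grad, shift, step=0.2):
    """One dynamic mirror descent step with a Euclidean mirror map:
    a gradient step on the current round's loss, then the shift model
    Phi advances the iterate to anticipate the next round's optimum."""
    return shift(theta - step * grad(theta))

# Non-stationary problem: track a drifting target t_k = 0.01 * k under
# the per-round loss (theta - t_k)^2.
shift = lambda th: th + 0.01            # shift model matching the true drift
theta = 1.0                             # start away from the optimum
errs = []
for k in range(500):
    target = 0.01 * k
    grad = lambda th, t=target: 2.0 * (th - t)
    errs.append(abs(theta - target))
    theta = dmd_step(theta, grad, shift)
```

Here the tracking error contracts geometrically (e_{k+1} = 0.6 e_k for this step size) because the shift model matches the drift; with a mismatched shift model the error would instead settle at a nonzero level, which is the regime the paper's dynamic-regret analysis addresses.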