Predictive Control with Learning-Based Terminal Costs Using Approximate Value Iteration

Authors: Francisco Moreno-Mora, Lukas Beckenbach, Stefan Streif
Stability under model predictive control (MPC) schemes is frequently ensured by terminal ingredients. Employing a (control) Lyapunov function as the terminal cost constitutes a common choice. Learning-based methods may be used to construct the terminal cost by relating it to, for instance, an infinite-horizon optimal control problem in which the optimal cost is a Lyapunov function. Value iteration, an approximate dynamic programming (ADP) approach, refers to one particular cost approximation…
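To make the value iteration idea in the abstract concrete, the following is a minimal sketch for a scalar linear-quadratic problem. It is not the paper's algorithm: the dynamics x+ = a*x + b*u, the stage cost q*x^2 + r*u^2, the numerical values, and the function name `value_iteration_scalar_lq` are all assumptions chosen for illustration. In this special case each iterate has the form V_k(x) = p_k*x^2, so the Bellman update reduces to a scalar Riccati recursion started from V_0 = 0.

```python
def value_iteration_scalar_lq(a, b, q, r, iters=200, tol=1e-12):
    """Value iteration for x+ = a*x + b*u with stage cost q*x^2 + r*u^2.

    Hypothetical illustration: every iterate is V_k(x) = p_k * x^2,
    so only the scalar coefficient p_k needs to be propagated.
    """
    p = 0.0              # V_0(x) = 0
    history = [p]
    for _ in range(iters):
        # Bellman update: min over u of q*x^2 + r*u^2 + p*(a*x + b*u)^2,
        # evaluated in closed form for the quadratic case.
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        history.append(p_next)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    return p, history


# Assumed example data: an open-loop unstable scalar system.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
p_star, hist = value_iteration_scalar_lq(a, b, q, r)

# Greedy feedback gain for the converged cost, u = k_gain * x.
k_gain = -(a * b * p_star) / (r + b * b * p_star)
a_closed_loop = a + b * k_gain
```

Under these assumptions, the converged function p_star*x^2 approximates the infinite-horizon optimal cost and could serve as a candidate terminal cost in a finite-horizon MPC problem. For nonlinear systems the closed-form update is not available and a function approximator would take the place of the coefficient p_k.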

Constrained model predictive control: Stability and optimality

A Quasi-Infinite Horizon Nonlinear Model Predictive Control Scheme with Guaranteed Stability

We present in this paper a novel nonlinear model predictive control scheme that guarantees asymptotic closed-loop stability. The scheme can be applied to both stable and unstable systems with input

Stability and performance in MPC using a finite-tail cost

The main practical benefit of the considered finite-tail cost MPC formulation is the simpler offline design in combination with typically significantly less restrictive bounds on the prediction horizon to ensure stability.

When to stop value iteration: stability and near-optimality versus computation

The considered class of stopping criteria encompasses those encountered in the control, dynamic programming and reinforcement learning literature and it allows considering new ones, which may be useful to further reduce the computational cost while endowing and satisfying stability and near-optimality properties.

Revisiting Approximate Dynamic Programming and its Convergence

  • A. Heydari
  • Mathematics
    IEEE Transactions on Cybernetics
  • 2014
Value iteration-based approximate/adaptive dynamic programming (ADP) as an approximate solution to infinite-horizon optimal control problems with deterministic dynamics and continuous state and action spaces is investigated and a relatively simple proof for the convergence of the outer-loop iterations to the optimal solution is provided.

Theoretical and Numerical Analysis of Approximate Dynamic Programming with Approximation Errors

Convergence of Value Iteration scheme of ADP for deterministic nonlinear optimal control problems with undiscounted cost functions is investigated while considering the errors existing in approximating respective functions.

Stabilizing receding-horizon control of nonlinear time-varying systems

A receding horizon control scheme for nonlinear time-varying systems is proposed which is based on a finite-horizon optimization problem with a terminal state penalty and ensures exponential stability of the equilibrium.

Value and Policy Iterations in Optimal Control and Adaptive Dynamic Programming

  • D. Bertsekas
  • Mathematics
    IEEE Transactions on Neural Networks and Learning Systems
  • 2017
Under very general assumptions, the uniqueness of the solution of Bellman’s equation is established, and convergence results for value and policy iterations are provided.

Adaptive Critic-Based Solution to an Orbital Rendezvous Problem

The optimal continuous thrust rendezvous maneuver is investigated in this Note. The objective is for a rigid spacecraft to perform a maneuver into a destination orbit within a given final time.