# Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof

@article{AlTamimi2008DiscreteTimeNH,
title={Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof},
author={Asma Al-Tamimi and Frank L. Lewis and Murad Abu-Khalaf},
journal={IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)},
year={2008},
volume={38},
pages={943-949}
}
• Published 1 August 2008
• Mathematics, Computer Science, Medicine
• IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
Convergence of the value-iteration-based heuristic dynamic programming (HDP) algorithm is proven in the case of general nonlinear systems. That is, it is shown that HDP converges to the optimal control and the optimal value function that solves the Hamilton-Jacobi-Bellman equation appearing in infinite-horizon discrete-time (DT) nonlinear optimal control. It is assumed that, at each iteration, the value and action update equations can be exactly solved. The following two standard neural…
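For intuition, the value-iteration HDP recursion described in the abstract, $V_{i+1}(x) = \min_u [\,r(x,u) + V_i(f(x) + g(x)u)\,]$ starting from $V_0 = 0$, can be sketched on a grid. The scalar dynamics, quadratic stage cost, grids, and tolerances below are illustrative assumptions chosen for demonstration, not taken from the paper (which solves the updates with neural networks rather than a lookup table):

```python
import numpy as np

# Assumed example dynamics x_{k+1} = f(x_k) + g(x_k) u_k (not from the paper).
def f(x):
    return 0.8 * np.sin(x)

def g(x):
    return 1.0 + 0.1 * x**2

xs = np.linspace(-2.0, 2.0, 201)   # state grid
us = np.linspace(-1.5, 1.5, 121)   # action grid
V = np.zeros_like(xs)              # V_0 = 0, as the algorithm prescribes

for i in range(200):
    # Bellman backup: V_{i+1}(x) = min_u [ x^2 + u^2 + V_i(f(x) + g(x)u) ]
    xn = f(xs)[:, None] + g(xs)[:, None] * us[None, :]   # all next states
    # Interpolate V_i at the next states (clipping to the grid is a
    # crude boundary treatment, acceptable for a sketch).
    Vn = np.interp(np.clip(xn, xs[0], xs[-1]), xs, V)
    Q = xs[:, None] ** 2 + us[None, :] ** 2 + Vn
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = us[Q.argmin(axis=1)]      # greedy control from the last backup
```

With $V_0 = 0$ and a nonnegative stage cost, the iterates are nonnegative and nondecreasing, which is the monotonicity the convergence proof exploits; at the origin the value stays exactly zero since $u = 0$ is optimal there.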
732 Citations
Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems
• Computer Science, Medicine
• IEEE Transactions on Neural Networks and Learning Systems
• 2014
It is shown that the iterative performance index function is nonincreasingly convergent to the optimal solution of the Hamilton-Jacobi-Bellman equation, and it is proven that any of the iterative control laws can stabilize the nonlinear system.
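The policy-iteration scheme these works study (evaluate the current control law, then improve it) can be sketched in its LQR special case, where policy evaluation reduces to a discrete Lyapunov equation. The system matrices and the admissible (stabilizing) initial gain below are illustrative assumptions, not taken from any of the cited papers:

```python
import numpy as np

# Assumed example double-integrator-like system (illustrative only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.array([[1.0]])

def evaluate(K):
    """Policy evaluation: solve P = Q + K'RK + (A-BK)' P (A-BK)
    via the Kronecker-product vectorization of the Lyapunov equation."""
    Acl = A - B @ K
    Qc = Q + K.T @ R @ K
    n = A.shape[0]
    M = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    return np.linalg.solve(M, Qc.reshape(-1)).reshape(n, n)

K = np.array([[1.0, 1.0]])   # assumed admissible initial control law
for _ in range(50):
    P = evaluate(K)                                     # policy evaluation
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # policy improvement

# At convergence P satisfies the discrete algebraic Riccati equation.
res = Q + A.T @ P @ A - P \
    - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

Starting from a stabilizing gain, each evaluated cost matrix is nonincreasing (in the matrix order) toward the Riccati solution, which mirrors the nonincreasing convergence proved for the nonlinear iterative value functions above.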
On-policy Approximate Dynamic Programming for Optimal Control of non-linear systems
• Computer Science
• 2020 7th International Conference on Control, Decision and Information Technologies (CoDIT)
• 2020
The paper employs approximate dynamic programming to solve the HJB equation for deterministic nonlinear discrete-time systems with continuous state and action spaces, implementing a policy iteration algorithm within an actor-critic architecture.
Approximate Dynamic Programming with Gaussian Processes for Optimal Control of Continuous-Time Nonlinear Systems
• Computer Science
• 2020
A new algorithm for the realization of approximate dynamic programming (ADP) with Gaussian processes (GPs) is proposed for infinite-horizon optimal control of continuous-time (CT) nonlinear input-affine systems.
Data-based approximate policy iteration for nonlinear continuous-time optimal control design
• Computer Science, Mathematics
• ArXiv
• 2013
A model-free policy iteration algorithm is derived for the constrained optimal control problem and its convergence is proved; it can learn the solution of the HJB equation and the optimal control policy without requiring any knowledge of the system's mathematical model.
Data-Driven Finite-Horizon Approximate Optimal Control for Discrete-Time Nonlinear Systems Using Iterative HDP Approach
• Computer Science
• IEEE Transactions on Cybernetics
• 2018
A data-based finite-horizon optimal control approach is proposed for discrete-time nonlinear affine systems; the convergence of the iterative ADP algorithm and the stability of the weight-estimation errors under the HDP structure are analyzed in detail.
Policy Iteration for Optimal Control of Discrete-Time Nonlinear Systems
• Computer Science
• 2017
It is shown that the iterative value function is nonincreasingly convergent to the optimal solution of the Bellman equation, and it is proven that any of the iterative control laws can stabilize the nonlinear system.
Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence
• Computer Science, Mathematics
• Neural Networks
• 2009
The need for partial knowledge of the nonlinear system dynamics is relaxed through a novel two-part ADP approach: online system identification and offline optimal control training.
Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design
• Computer Science
• Autom.
• 2014
This paper addresses the model-free nonlinear optimal control problem by introducing the reinforcement learning (RL) technique: a data-based approximate policy iteration (API) method that uses real system data rather than a system model.
Value Iteration ADP for Discrete-Time Nonlinear Systems
• Computer Science
• 2017
An iterative $\theta$-ADP algorithm is developed to solve the optimal control problem for infinite-horizon discrete-time nonlinear systems; it is shown that each of the iterative controls can stabilize the nonlinear system, and the requirement of an initial admissible control is avoided.
Online Optimal Control of Affine Nonlinear Discrete-Time Systems With Unknown Internal Dynamics by Using Time-Based Policy Update
• Mathematics, Medicine
• IEEE Transactions on Neural Networks and Learning Systems
• 2012
The Hamilton-Jacobi-Bellman equation is solved forward-in-time for the optimal control of a class of general affine nonlinear discrete-time systems without using value and policy iterations, and the end result is the systematic design of an optimal controller with guaranteed convergence that is suitable for hardware implementation.

#### References

SHOWING 1-10 OF 66 REFERENCES
Neural Network-based Nearly Optimal Hamilton-Jacobi-Bellman Solution for Affine Nonlinear Discrete-Time Systems
• Mathematics
• Proceedings of the 44th IEEE Conference on Decision and Control
• 2005
In this paper, we consider the use of nonlinear networks towards obtaining nearly optimal solutions to the control of nonlinear discrete-time systems. The method is based on least-squares successive…
Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
• Mathematics, Computer Science
• Autom.
• 2005
It is shown that the constrained optimal control law has the largest region of asymptotic stability (RAS), and the result is a nearly optimal constrained state feedback controller that has been tuned a priori off-line.
Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control
• Mathematics, Computer Science
• Autom.
• 2007
It is proven that the algorithm is a model-free iterative algorithm that solves the game algebraic Riccati equation (GARE) of the linear quadratic discrete-time zero-sum game.
Model-free Q-learning designs for discrete-time zero-sum games with application to H-infinity control
• 2007 European Control Conference (ECC)
• 2007
In this paper, the optimal strategies for discrete-time linear system quadratic zero-sum games related to the H-infinity optimal control problem are solved in forward time without knowing the system…
An algorithm to solve the discrete HJI equation arising in the L2 gain optimization problem
A synthesis of the discrete nonlinear H∞ control law boils down to the solution of a set of algebraic and partial differential equations known as the discrete Hamilton-Jacobi-Isaacs (DHJI) equation…
H∞-control of discrete-time nonlinear systems
• Computer Science, Mathematics
• IEEE Trans. Autom. Control.
• 1996
This paper presents an explicit solution to the problem of disturbance attenuation with internal stability via full information feedback, state feedback, and dynamic output feedback, respectively…
Adaptive dynamic programming
• Mathematics, Computer Science
• IEEE Trans. Syst. Man Cybern. Part C
• 2002
An adaptive dynamic programming algorithm (ADPA) is described which fuses soft computing techniques to learn the optimal cost functional for a stabilizable nonlinear system with unknown dynamics, and hard computing techniques to verify the stability and convergence of the algorithm.
Hamilton-Jacobi-Isaacs formulation for constrained input nonlinear systems
• Mathematics
• 2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601)
• 2004
In this paper, we consider the H∞ nonlinear state feedback control of constrained input systems. The input constraints are encoded via a quasi-norm that enables applying quasi-L…
Adaptive Critic Designs for Discrete-Time Zero-Sum Games With Application to $H_{\infty}$ Control
• Mathematics, Computer Science
• IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
• 2007
In this correspondence, adaptive critic approximate dynamic programming designs are derived to solve the discrete-time zero-sum game in which the state and action spaces are continuous. This results…
Adaptive linear quadratic control using policy iteration
• Computer Science
• Proceedings of 1994 American Control Conference - ACC '94
• 1994
Stability and convergence results for dynamic programming-based reinforcement learning applied to linear quadratic regulation (LQR) are presented; the algorithm is based on Q-learning and is proven to converge to an optimal controller.