Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof

@article{AlTamimi2008DiscreteTimeNH,
  title={Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof},
  author={Asma Al-Tamimi and Frank L. Lewis and Murad Abu-Khalaf},
  journal={IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)},
  year={2008},
  volume={38},
  pages={943-949}
}
  • Published 1 August 2008
Convergence of the value-iteration-based heuristic dynamic programming (HDP) algorithm is proven in the case of general nonlinear systems. That is, it is shown that HDP converges to the optimal control and the optimal value function that solves the Hamilton-Jacobi-Bellman equation appearing in infinite-horizon discrete-time (DT) nonlinear optimal control. It is assumed that, at each iteration, the value and action update equations can be exactly solved. The following two standard neural…
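The value-iteration HDP scheme described in the abstract can be sketched for a hypothetical scalar system on a state grid. Everything below (the dynamics f and g, the cost weights, the grids) is an assumption for illustration; the paper itself uses neural-network approximators and proves convergence under exact value and action updates, which the grid interpolation here only approximates:

```python
import numpy as np

# Hypothetical scalar discrete-time system x_{k+1} = f(x_k) + g(x_k) u_k
f = lambda x: 0.8 * np.sin(x)        # internal dynamics (assumed for illustration)
g = lambda x: np.ones_like(x)        # input gain (assumed constant)
Q, R = 1.0, 1.0                      # stage cost Q*x^2 + R*u^2

xs = np.linspace(-2.0, 2.0, 201)     # state grid
us = np.linspace(-2.0, 2.0, 201)     # action grid
V = np.zeros_like(xs)                # V_0 = 0: the initialization used in the convergence proof

X, U = np.meshgrid(xs, us, indexing="ij")
Xnext = f(X) + g(X) * U              # one-step successor for every (x, u) pair
stage = Q * X**2 + R * U**2          # stage cost on the grid

for j in range(300):                 # HDP / value-iteration loop
    # value update: V_{j+1}(x) = min_u [ stage(x, u) + V_j(x_{k+1}) ]
    Vnext = np.interp(np.clip(Xnext, xs[0], xs[-1]), xs, V)
    V_new = (stage + Vnext).min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

# greedy (approximately optimal) control extracted from the converged value function
Vnext = np.interp(np.clip(Xnext, xs[0], xs[-1]), xs, V)
u_star = us[(stage + Vnext).argmin(axis=1)]
```

Starting from V_0 = 0, the iterates are pointwise nondecreasing and converge to the (grid-approximated) HJB value function; the monotone convergence in the exact-update setting is precisely what the paper establishes.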
Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems
  • Derong Liu, Q. Wei
  • Computer Science, Medicine
  • IEEE Transactions on Neural Networks and Learning Systems
  • 2014
It is shown that the iterative performance index function is nonincreasingly convergent to the optimal solution of the Hamilton-Jacobi-Bellman equation, and it is proven that any of the iterative control laws can stabilize the nonlinear systems.
On-policy Approximate Dynamic Programming for Optimal Control of non-linear systems
The paper employs approximate dynamic programming to solve the HJB equation for deterministic nonlinear discrete-time systems with continuous state and action spaces, implementing a policy iteration algorithm within an actor-critic architecture.
Approximate Dynamic Programming with Gaussian Processes for Optimal Control of Continuous-Time Nonlinear Systems
A new algorithm for realizing approximate dynamic programming (ADP) with Gaussian processes (GPs) is proposed for infinite-horizon optimal control of continuous-time (CT) nonlinear input-affine systems.
Data-based approximate policy iteration for nonlinear continuous-time optimal control design
A model-free policy iteration algorithm is derived for the constrained optimal control problem and its convergence is proved; the algorithm learns the solution of the HJB equation and the optimal control policy without requiring any knowledge of the system's mathematical model.
Data-Driven Finite-Horizon Approximate Optimal Control for Discrete-Time Nonlinear Systems Using Iterative HDP Approach
A data-based finite-horizon optimal control approach is presented for discrete-time nonlinear affine systems, and the convergence of the iterative ADP algorithm and the stability of the weight estimation errors under the HDP structure are analyzed in detail.
Policy Iteration for Optimal Control of Discrete-Time Nonlinear Systems
It is shown that the iterative value function is nonincreasingly convergent to the optimal solution of the Bellman equation, and it is proven that any of the iterative control laws can stabilize the nonlinear system.
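The policy iteration summarized above can be made concrete in its linear-quadratic specialization, where policy evaluation reduces to a discrete Lyapunov equation and policy improvement has a closed form. The system matrices and initial gain below are assumed examples, not taken from the paper:

```python
import numpy as np

# Assumed linear-quadratic instance: x_{k+1} = A x_k + B u_k, cost x'Qx + u'Ru
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # sampled double integrator (assumed)
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.array([[1.0, 2.0]])               # initial admissible (stabilizing) gain

for i in range(50):                      # policy iteration loop
    Acl = A - B @ K
    # policy evaluation: P solves the discrete Lyapunov equation
    #   P = (Q + K'RK) + Acl' P Acl
    M = Q + K.T @ R @ K
    P = M.copy()
    for _ in range(3000):                # fixed-point sweep; Acl is Schur stable
        P = M + Acl.T @ P @ Acl
    # policy improvement: K_{i+1} = (R + B'PB)^{-1} B'PA
    K_new = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    if np.max(np.abs(K_new - K)) < 1e-9:
        break
    K = K_new
```

The value kernels P_i decrease monotonically to the algebraic Riccati solution and every intermediate gain is stabilizing, mirroring the nonincreasing convergence and stability properties proved for the nonlinear case.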
Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence
The need for partial knowledge of the nonlinear system dynamics is relaxed by developing a novel ADP approach with a two-part process: online system identification and offline optimal control training.
Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design
This paper addresses the model-free nonlinear optimal control problem with a data-based approximate policy iteration (API) method that introduces the reinforcement learning (RL) technique and uses real system data rather than a system model.
Value Iteration ADP for Discrete-Time Nonlinear Systems
An iterative θ-ADP algorithm is developed to solve the optimal control problem for infinite-horizon discrete-time nonlinear systems; it is shown that each of the iterative controls stabilizes the nonlinear system, and the requirement of an initial admissible control is avoided effectively.
Online Optimal Control of Affine Nonlinear Discrete-Time Systems With Unknown Internal Dynamics by Using Time-Based Policy Update
The Hamilton-Jacobi-Bellman equation is solved forward in time for the optimal control of a class of general affine nonlinear discrete-time systems without using value and policy iterations; the result is the systematic design of an optimal controller with guaranteed convergence that is suitable for hardware implementation.

References

Showing 1-10 of 66 references
Neural Network-Based Nearly Optimal Hamilton-Jacobi-Bellman Solution for Affine Nonlinear Discrete-Time Systems
In this paper, we consider the use of nonlinear networks towards obtaining nearly optimal solutions to the control of nonlinear discrete-time systems. The method is based on least-squares successive…
Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
It is shown that the constrained optimal control law has the largest region of asymptotic stability (RAS), and the result is a nearly optimal constrained state feedback controller that has been tuned a priori offline.
Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control
It is proven that the algorithm amounts to a model-free iterative algorithm for solving the game algebraic Riccati equation (GARE) of the linear quadratic discrete-time zero-sum game.
Model-free Q-learning designs for discrete-time zero-sum games with application to H-infinity control
In this paper, the optimal strategies for discrete-time linear system quadratic zero-sum games related to the H-infinity optimal control problem are solved in forward time without knowing the system…
An algorithm to solve the discrete HJI equation arising in the L2 gain optimization problem
A synthesis of the discrete nonlinear H∞ control law boils down to the solution of a set of algebraic and partial differential equations known as the discrete Hamilton-Jacobi-Isaacs (DHJI) equation,…
H∞-control of discrete-time nonlinear systems
This paper presents an explicit solution to the problem of disturbance attenuation with internal stability via full information feedback, state feedback, and dynamic output feedback, respectively,…
Adaptive dynamic programming
An adaptive dynamic programming algorithm (ADPA) is described which fuses soft computing techniques to learn the optimal cost functional for a stabilizable nonlinear system with unknown dynamics, and hard computing techniques to verify the stability and convergence of the algorithm.
Hamilton-Jacobi-Isaacs formulation for constrained input nonlinear systems
In this paper, we consider the H∞ nonlinear state feedback control of constrained input systems. The input constraints are encoded via a quasi-norm that enables applying quasi-L…
Adaptive Critic Designs for Discrete-Time Zero-Sum Games With Application to H∞ Control
In this correspondence, adaptive critic approximate dynamic programming designs are derived to solve the discrete-time zero-sum game in which the state and action spaces are continuous. This results…
Adaptive linear quadratic control using policy iteration
Stability and convergence results are presented for dynamic-programming-based reinforcement learning applied to linear quadratic regulation (LQR); the specific algorithm is based on Q-learning and is proven to converge to an optimal controller.
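The Q-learning-for-LQR idea in that reference can be sketched in model-free form: a quadratic Q-function kernel H is fit by least squares from sampled transitions alone. The plant matrices below are an assumed example and serve only as a black-box data generator; the learning update never reads them:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example plant; used purely as a simulator producing (x, u, cost, x_next)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qc, Rc = np.eye(2), np.array([[1.0]])
n, m = 2, 1
iu = np.triu_indices(n + m)

def phi(z):
    """Quadratic features of z = [x; u]: upper-triangular products z_i z_j."""
    return np.outer(z, z)[iu]

H = np.zeros((n + m, n + m))             # Q-function kernel, H_0 = 0
for j in range(400):
    Z, T = [], []
    for _ in range(40):                  # 40 random exploratory transitions per iteration
        x = rng.uniform(-1, 1, n)
        u = rng.uniform(-1, 1, m)
        xn = A @ x + B @ u               # black-box simulator call
        cost = x @ Qc @ x + u @ Rc @ u
        # greedy next action under the current kernel: u' = -Huu^{-1} Hux x'
        un = -np.linalg.solve(H[n:, n:] + 1e-9 * np.eye(m), H[n:, :n] @ xn)
        zn = np.concatenate([xn, un])
        T.append(cost + zn @ H @ zn)     # Bellman target for Q_{j+1}
        Z.append(phi(np.concatenate([x, u])))
    # least-squares fit of the next kernel from data alone
    w, *_ = np.linalg.lstsq(np.array(Z), np.array(T), rcond=None)
    H_new = np.zeros_like(H)
    H_new[iu] = w
    H_new = (H_new + H_new.T) / 2        # fitted off-diagonal weights equal 2*H_ij
    if np.max(np.abs(H_new - H)) < 1e-7:
        H = H_new
        break
    H = H_new

K = np.linalg.solve(H[n:, n:], H[n:, :n])   # learned feedback u = -K x
```

Because the Q-function of a linear quadratic problem is exactly quadratic in (x, u), the least-squares fit recovers each value-iteration step exactly, and the learned gain matches the Riccati-optimal controller without the system model ever entering the update.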