Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven

  • Qingtao Zhao, Jennie Si, Jian Sun
  • Published 16 June 2020
  • Computer Science
  • IEEE Transactions on Neural Networks and Learning Systems
In this work, time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives. Among existing approximate dynamic programming (ADP) and reinforcement learning (RL) algorithms, direct heuristic dynamic programming (dHDP) has been shown to be an effective tool, as demonstrated in solving several complex learning control problems. It continuously updates the control policy and the critic as system states continuously evolve. It…
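
The contrast in the title between time-driven and event-driven updating can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract is truncated before any trigger condition is stated, so the threshold rule below, the function name `count_updates`, and the sample data are all hypothetical, and the weight update itself is reduced to a counter.

```python
def count_updates(states, threshold=None):
    """Count how many learning updates a sequence of scalar states triggers.

    threshold=None  -> time-driven: update on every new sample.
    threshold=float -> event-driven: update only when the state has moved
                       more than `threshold` from the state at the last
                       update (an assumed, illustrative trigger rule).
    """
    updates = 0
    last_sampled = None
    for x in states:
        # Short-circuit order matters: the distance test is evaluated only
        # when a threshold is set and a previous sample exists.
        if threshold is None or last_sampled is None or abs(x - last_sampled) > threshold:
            updates += 1        # stands in for an actor/critic weight update
            last_sampled = x
    return updates


# Hypothetical state trajectory, for illustration only.
states = [0.0, 0.05, 0.1, 0.5, 0.52, 1.0]
time_driven = count_updates(states)          # one update per sample
event_driven = count_updates(states, 0.2)    # updates only on large changes
print(time_driven, event_driven)
```

Under this assumed rule the event-driven variant performs fewer updates on the same trajectory, which is the general motivation for event-driven schemes; the actual triggering condition and its analysis are those given in the paper itself.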


Event-Triggered Optimal Neuro-Controller Design With Reinforcement Learning for Unknown Nonlinear Systems
An optimal control scheme for continuous-time unknown nonlinear systems is developed using an event-triggering mechanism based on the Lyapunov method, together with an identifier-critic architecture under the framework of reinforcement learning.
Robust Optimal Control Scheme for Unknown Constrained-Input Nonlinear Systems via a Plug-n-Play Event-Sampled Critic-Only Algorithm
The proposed robust optimal control algorithm tunes the parameters of a critic-only neural network under an event-triggering condition and runs in a plug-n-play framework without system functions, where fewer transmissions and less computation are required as all the measurements are received simultaneously.
Near Optimal Event-Triggered Control of Nonlinear Discrete-Time Systems Using Neurodynamic Programming
An event-triggered near-optimal control scheme for uncertain nonlinear discrete-time systems is presented, with a Lyapunov technique used in conjunction with the event-trigger condition to guarantee the ultimate boundedness of the closed-loop system.
Adaptive Critic Designs for Event-Triggered Robust Control of Nonlinear Systems With Unknown Dynamics
Using the Lyapunov method, it is proved that the derived optimal event-triggered control (ETC) guarantees uniform ultimate boundedness of all the signals in the original system.
A boundedness result for the direct heuristic dynamic programming
Adaptive Event-Triggered Control Based on Heuristic Dynamic Programming for Nonlinear Discrete-Time Systems
A new trigger threshold for discrete-time systems is designed and a detailed Lyapunov stability analysis shows that the proposed event-triggered controller can asymptotically stabilize the discrete-time systems.
Event-Triggered Adaptive Dynamic Programming for Continuous-Time Systems With Control Constraints
An event-triggered near optimal control structure is developed for nonlinear continuous-time systems with control constraints and an actor-critic framework is presented to solve the HJB equation.
Event-Triggered Optimal Control With Performance Guarantees Using Adaptive Dynamic Programming
It is proved that semiglobal uniform ultimate boundedness can be guaranteed for states and NN weight errors with the ADP-based ETOC, and a predetermined upper bound is provided by proving the existence of a lower bound for interexecution time.
Event-Based Robust Control for Uncertain Nonlinear Systems Using Adaptive Dynamic Programming
It is proved that the solution of the optimal control problem can asymptotically stabilize the uncertain system with an adaptive triggering condition, and the designed event-based controller is robust to the original uncertain system.
An Event-Triggered ADP Control Approach for Continuous-Time System With Unknown Internal States
A neural-network-based observer is integrated to recover the system internal states from the measurable feedback to reduce the computation cost and transmission load of the event-triggered adaptive dynamic programming control method.