# Data-driven Rollout for Deterministic Optimal Control

@article{Li2021DatadrivenRF, title={Data-driven Rollout for Deterministic Optimal Control}, author={Yuchao Li and Karl Henrik Johansson and Jonas M{\aa}rtensson}, journal={2021 60th IEEE Conference on Decision and Control (CDC)}, year={2021}, pages={2169-2176} }

We consider deterministic infinite horizon optimal control problems with nonnegative stage costs. We draw inspiration from the learning model predictive control scheme designed for continuous dynamics and iterative tasks, and propose a rollout algorithm that relies on sampled data generated by some base policy. The proposed algorithm is based on value and policy iteration ideas, and applies to deterministic problems with arbitrary state and control spaces, and arbitrary dynamics. It admits…
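The idea of rollout driven by sampled base-policy data can be illustrated with a minimal sketch. This is not the paper's algorithm: the toy problem (integer states, a goal at 0, controls that move 1 or 2 steps), the stage costs, and all function names below are hypothetical choices made only for illustration. The sketch records the cost-to-go of a base policy along one sampled trajectory, then picks controls by one-step lookahead over the sampled states.

```python
from typing import Dict, List

GOAL = 0               # assumed cost-free terminal state
CONTROLS = (1, 2)      # hypothetical controls: move 1 or 2 steps toward the goal


def f(x: int, u: int) -> int:
    """Deterministic dynamics of the toy problem."""
    return x - u


def g(x: int, u: int) -> float:
    """Nonnegative stage cost (illustrative values)."""
    return 1.0 if u == 1 else 1.5


def feasible(x: int) -> List[int]:
    """Controls that do not overshoot the goal."""
    return [u for u in CONTROLS if x - u >= GOAL]


def base_policy(x: int) -> int:
    """A simple suboptimal base policy: always move one step."""
    return 1


def sample_costs(x0: int) -> Dict[int, float]:
    """Run the base policy from x0 and record its cost-to-go at each visited state."""
    states, costs = [x0], []
    x = x0
    while x != GOAL:
        u = base_policy(x)
        costs.append(g(x, u))
        x = f(x, u)
        states.append(x)
    J = {GOAL: 0.0}
    tail = 0.0
    # Accumulate costs backward from the goal.
    for s, c in zip(reversed(states[:-1]), reversed(costs)):
        tail += c
        J[s] = tail
    return J


def rollout_control(x: int, J: Dict[int, float]) -> int:
    """One-step lookahead restricted to successor states present in the data."""
    candidates = [(g(x, u) + J[f(x, u)], u) for u in feasible(x) if f(x, u) in J]
    return min(candidates)[1]


def rollout_cost(x0: int, J: Dict[int, float]) -> float:
    """Total cost of following the rollout policy from x0 to the goal."""
    total, x = 0.0, x0
    while x != GOAL:
        u = rollout_control(x, J)
        total += g(x, u)
        x = f(x, u)
    return total
```

In this toy setting, calling `sample_costs(6)` and then `rollout_cost(6, J)` gives a rollout cost no larger than the base policy's cost from the same state, which mirrors the cost improvement property usually associated with rollout.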

## 2 Citations

### Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control

- Computer Science
- ArXiv
- 2021

This paper shows that the principal AlphaZero/TD-Gammon ideas of approximation in value space and rollout apply very broadly to deterministic and stochastic optimal control problems, involving both discrete and continuous search spaces.

### Newton’s method for reinforcement learning and model predictive control

- Computer Science
- Results in Control and Optimization
- 2022

## References


### Learning Model Predictive Control for Iterative Tasks. A Data-Driven Control Framework

- Engineering
- IEEE Transactions on Automatic Control
- 2018

The control design approach is presented, and it is shown how to recursively construct terminal set and terminal cost from state and input trajectories of previous iterations.

### Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC

- Mathematics
- Eur. J. Control
- 2005

It is shown that the most common MPC schemes can be viewed as rollout algorithms and are related to policy iteration methods. These schemes are embedded within a new unifying suboptimal control framework, based on a concept of restricted or constrained structure policies, which contains them as special cases.

### Rollout Algorithms for Constrained Dynamic Programming

- Computer Science
- 2005

An extension of the rollout algorithm is derived that applies to constrained deterministic dynamic programming problems, and relies on a suboptimal policy, called base heuristic, which under suitable assumptions produces a feasible solution.

### Learning Model Predictive Control for Iterative Tasks

- Engineering
- ArXiv
- 2016

The paper presents the control design approach, and shows how to recursively construct terminal set and terminal cost from state and input trajectories of previous iterations of the LMPC.

### Dynamic Programming and Optimal Control

- Computer Science
- 1995

The leading and most up-to-date textbook on the far-ranging algorithmic methodology of Dynamic Programming, which can be used for optimal control, Markovian decision problems, planning and sequential…

### Cooperative distributed model predictive control for nonlinear systems

- Computer Science, Engineering
- 2011

### Optimal Infinite-Horizon Feedback Laws for a General Class of Constrained Discrete-Time Systems: Stability and Moving-Horizon Approximations

- Mathematics
- 2004

Stability results are given for a class of feedback systems arising from the regulation of time-varying discrete-time systems using optimal infinite-horizon and moving-horizon feedback laws. The…

### Negative Dynamic Programming

- Mathematics
- 1984

This paper deals with negative dynamic programming problems, i.e. discrete time total reward problems with non-positive reward functions and countable state space, and shows that ε-optimal stationary policies exist in general dynamic programming problems if this is true for the imbedded negative model.

### Multiagent Reinforcement Learning: Rollout and Policy Iteration

- Computer Science
- IEEE/CAA Journal of Automatica Sinica
- 2021

This paper discusses autonomous multiagent rollout schemes that allow the agents to make decisions autonomously through the use of precomputed signaling information, which is sufficient to maintain the cost improvement property, without any on-line coordination of control selection between the agents.

### A Rollout Policy for the Vehicle Routing Problem with Stochastic Demands

- Computer Science
- Oper. Res.
- 2001

The resulting rollout policy appears to be the first computationally tractable algorithm for approximately solving the problem under the reoptimization approach by sequentially improving a given a priori solution by means of a rollout algorithm.