Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control

Abstract

We propose stochastic approximation based methods with randomization of samples in two different settings: one for policy evaluation using the least squares temporal difference (LSTD) algorithm, and the other for solving the least squares problem. We consider a “big data” regime where both the dimension, d, of the data and the number, T, of training samples…
DOI: 10.1007/978-3-662-44851-9_5
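
As a rough illustration of the least squares setting mentioned in the abstract, the sketch below runs a generic stochastic approximation iteration that touches one uniformly sampled training row per step, so each update costs O(d) rather than requiring a d-by-d matrix solve. It is not taken from the paper: the function name `sa_least_squares`, the step-size schedule, the iteration count, and the use of the last iterate are all assumptions made for illustration.

```python
import numpy as np


def sa_least_squares(X, y, n_iters=20000, step0=0.05, seed=0):
    """Approximate argmin_theta ||X @ theta - y||^2 by stochastic approximation.

    Each iteration draws one row index uniformly at random and takes a
    gradient-style step on that single sample. The schedule step0 / sqrt(n)
    is an illustrative assumption, not a recommendation from the paper.
    """
    rng = np.random.default_rng(seed)
    T, d = X.shape
    theta = np.zeros(d)
    for n in range(1, n_iters + 1):
        i = rng.integers(T)              # uniformly random sample index
        x_i = X[i]
        residual = y[i] - x_i @ theta    # prediction error on the sampled row
        theta = theta + (step0 / np.sqrt(n)) * residual * x_i
    return theta
```

Calling `sa_least_squares(X, y)` with an array `X` of shape (T, d) and targets `y` of shape (T,) returns a length-d parameter estimate. The step size should be chosen small enough relative to the squared feature norms for the iteration to remain stable.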

Cite this paper

@inproceedings{A2014FastLU,
  title={Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control},
  author={Prashanth L. A. and Nathaniel Korda and R{\'e}mi Munos},
  booktitle={ECML/PKDD},
  year={2014}
}