# Stochastic approximation for efficient LSTD and least squares regression

We propose stochastic approximation methods with randomization of samples in two settings: policy evaluation using the least squares temporal difference (LSTD) algorithm, and the least squares regression problem. We consider a "big data" regime in which both the dimension, d, of the data and the number, T, of training samples are large. Through finite-time analyses, we provide performance bounds for these methods, both in high probability and in expectation.
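To make the general idea concrete for the least squares setting, the following is a minimal sketch of a stochastic approximation iteration that draws one training sample uniformly at random per step. It is an illustration under stated assumptions, not necessarily the exact scheme analysed here: the function name `sgd_least_squares`, the zero initialisation, and the step-size schedule `step / sqrt(n)` are all hypothetical choices made for the example.

```python
import random


def sgd_least_squares(X, y, n_iters, step=0.5, seed=0):
    """Approximate the least squares solution of X theta = y.

    Each iteration uses a single row of X chosen uniformly at random,
    so one update costs O(d) operations for d-dimensional features.
    The step-size schedule is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    d = len(X[0])
    theta = [0.0] * d  # hypothetical initialisation at the origin
    for n in range(1, n_iters + 1):
        i = rng.randrange(len(X))  # uniform random sample index
        x_i, y_i = X[i], y[i]
        # Prediction error of the current iterate on the sampled point.
        residual = y_i - sum(a * b for a, b in zip(x_i, theta))
        gamma = step / (n ** 0.5)  # assumed decaying step size
        # Stochastic gradient step on the squared error of this sample.
        theta = [t + gamma * residual * a for t, a in zip(theta, x_i)]
    return theta
```

The point of the sketch is the per-iteration cost: each update reads only one sample and performs a constant number of d-dimensional vector operations, which is what makes this family of methods attractive when both d and T are large.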

