Performance bounds for λ policy iteration and application to the game of Tetris

@article{Scherrer2013PerformanceBF,
  title={Performance bounds for λ policy iteration and application to the game of Tetris},
  author={Bruno Scherrer},
  journal={J. Mach. Learn. Res.},
  year={2013},
  volume={14},
  pages={1181--1227}
}
We consider the discrete-time infinite-horizon optimal control problem formalized by Markov decision processes (Puterman, 1994; Bertsekas and Tsitsiklis, 1996). We revisit the work of Bertsekas and Ioffe (1996), which introduced λ policy iteration, a family of algorithms parametrized by λ that generalizes the standard value iteration and policy iteration algorithms and has deep connections with the temporal-difference algorithms described by Sutton and Barto (1998). We deepen the…
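As a rough illustration of how one parameter λ can span value iteration and policy iteration, the sketch below runs an exact λ policy iteration update on a tiny tabular problem. It is not taken from the paper: the 2-state MDP, its numbers, and every function name are hypothetical, and the update is the commonly stated fixed-point form, in which the next value is the fixed point of w ↦ (1 − λ) T_π v + λ T_π w with π greedy with respect to v. Under that form, λ = 0 should reduce to value iteration and λ = 1 to policy iteration.

```python
# Minimal sketch of exact lambda policy iteration on a toy MDP.
# ASSUMPTION: the MDP below (transitions, rewards, discount) is made up
# purely for illustration; none of it comes from the paper.
GAMMA = 0.9
N_STATES, N_ACTIONS = 2, 2
# P[a][s][t]: probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.1, 0.9], [0.7, 0.3]],
]
# R[a][s]: expected immediate reward for taking action a in state s.
R = [
    [1.0, 0.0],
    [0.0, 2.0],
]


def q_value(v, s, a):
    """One-step lookahead value of action a in state s under value v."""
    return R[a][s] + GAMMA * sum(P[a][s][t] * v[t] for t in range(N_STATES))


def greedy(v):
    """Policy that is greedy with respect to v (one action per state)."""
    return [max(range(N_ACTIONS), key=lambda a: q_value(v, s, a))
            for s in range(N_STATES)]


def lambda_pi_step(v, lam, inner_iters=500):
    """One lambda policy iteration update of v.

    Iterates w <- (1 - lam) * T_pi v + lam * T_pi w to (near) its fixed
    point, where pi is greedy with respect to v. The map is a contraction
    with modulus lam * GAMMA, so plain iteration converges.
    """
    pi = greedy(v)
    t_v = [q_value(v, s, pi[s]) for s in range(N_STATES)]  # T_pi v
    w = list(v)
    for _ in range(inner_iters):
        t_w = [q_value(w, s, pi[s]) for s in range(N_STATES)]  # T_pi w
        w = [(1 - lam) * t_v[s] + lam * t_w[s] for s in range(N_STATES)]
    return w


def lambda_policy_iteration(lam, n_iters=300):
    """Run n_iters updates from v = 0; return the value and a greedy policy."""
    v = [0.0] * N_STATES
    for _ in range(n_iters):
        v = lambda_pi_step(v, lam)
    return v, greedy(v)


if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0):
        value, policy = lambda_policy_iteration(lam)
        print(lam, [round(x, 4) for x in value], policy)
```

On this toy problem all three settings of λ should arrive at the same value function; what differs is how much policy evaluation each outer update performs. The inner loop is a deliberately naive stand-in for solving the linear system exactly.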

Citations

Publications citing this paper.

Budgeted Classification-based Policy Iteration

