Bandit Based Monte-Carlo Planning
- L. Kocsis, Csaba Szepesvari
- Computer Science, Mathematics
- ECML
- 18 September 2006
Improved Algorithms for Linear Stochastic Bandits
- Yasin Abbasi-Yadkori, D. Pál, Csaba Szepesvari
- Mathematics, Computer Science
- NIPS
- 12 December 2011
Fast gradient-descent methods for temporal-difference learning with linear function approximation
- R. Sutton, H. Maei, +4 authors Eric Wiewiora
- Mathematics, Computer Science
- ICML '09
- 14 June 2009
Algorithms for Reinforcement Learning
- Csaba Szepesvari
- Computer Science
- Algorithms for Reinforcement Learning
- 25 June 2010
Regret Bounds for the Adaptive Control of Linear Quadratic Systems
- Yasin Abbasi-Yadkori, Csaba Szepesvari
- Mathematics, Computer Science
- COLT
- 21 December 2011
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- J. Audibert, R. Munos, Csaba Szepesvari
- Mathematics, Computer Science
- Theor. Comput. Sci.
- 1 April 2009
Parametric Bandits: The Generalized Linear Case
- Sarah Filippi, O. Cappé, Aurélien Garivier, Csaba Szepesvari
- Mathematics, Computer Science
- NIPS
- 6 December 2010
Finite-Time Bounds for Fitted Value Iteration
- R. Munos, Csaba Szepesvari
- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 1 June 2008
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- A. Antos, Csaba Szepesvari, R. Munos
- Mathematics, Computer Science
- Machine Learning
- 1 April 2008
A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation
- R. Sutton, Csaba Szepesvari, H. Maei
- Mathematics, Computer Science
- NIPS
- 2008