Reinforcement learning for long-run average cost

Abhijit Gosavi
European Journal of Operational Research
A large class of sequential decision-making problems under uncertainty can be modeled as Markov and Semi-Markov Decision Problems when their underlying probability structure has a Markov chain. They may be solved using classical dynamic programming methods. However, dynamic programming methods suffer from the curse of dimensionality and break down rapidly in the face of large state spaces. In addition, dynamic programming methods require the exact computation of the so-called transition…
This paper has highly influenced 14 other papers.
This paper has 84 citations.


Publications citing this paper.

A Multiresolution Analysis-Assisted Reinforcement Learning Approach to Run-by-Run Control

IEEE Transactions on Automation Science and Engineering • 2007

Investigation of Q-Learning in the Context of a Virtual Learning Environment

Informatics in Education • 2007

Natural Gradient Policy for Average Cost SMDP Problem

19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007) • 2007

Semantic Scholar estimates that this publication has 85 citations based on the available data.

See our FAQ for additional information.


