Optimal Tuning of Continual Online Exploration in Reinforcement Learning

Youssef Achbany, François Fouss, Luh Yen, Alain Pirotte, Marco Saerens
This paper presents a framework for tuning continual exploration in an optimal way. It first quantifies the rate of exploration by defining the degree of exploration of a state as the entropy of the probability distribution over the admissible actions. The exploration/exploitation tradeoff is then stated as a global optimization problem: find the exploration strategy that minimizes the expected cumulated cost while maintaining fixed degrees of exploration at the nodes. In other words…
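The abstract's formulation can be illustrated with a small sketch. Fixing the entropy at a state and minimizing expected cost leads to a Boltzmann-type action distribution over the action costs; the sketch below (an assumption for illustration, not the paper's exact algorithm) tunes the temperature by bisection so that the policy's entropy matches a target degree of exploration, exploiting the fact that the entropy of a Boltzmann distribution increases monotonically with temperature.

```python
import math

def entropy(p):
    """Shannon entropy of a discrete probability distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def boltzmann(costs, theta):
    """Boltzmann action-selection probabilities for given action costs
    and temperature theta (lower cost -> higher probability)."""
    w = [math.exp(-c / theta) for c in costs]
    z = sum(w)
    return [wi / z for wi in w]

def tune_temperature(costs, target_entropy, lo=1e-3, hi=1e3, iters=60):
    """Bisect on the temperature so the policy entropy matches the
    target degree of exploration (0 <= target <= log(num_actions))."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if entropy(boltzmann(costs, mid)) < target_entropy:
            lo = mid  # too greedy: raise the temperature
        else:
            hi = mid  # too random: lower the temperature
    return (lo + hi) / 2
```

For example, with three actions the entropy ranges from 0 (greedy) to log 3 (uniform); asking for half of log 3 yields an intermediate, partially exploratory policy at one node.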
This paper has 32 citations.


