Control of exploitation-exploration meta-parameter in reinforcement learning

@article{Ishii2002ControlOE,
  title={Control of exploitation-exploration meta-parameter in reinforcement learning},
  author={Shin Ishii and Wako Yoshida and Junichiro Yoshimoto},
  journal={Neural networks : the official journal of the International Neural Network Society},
  year={2002},
  volume={15},
  number={4-6},
  pages={665--687}
}
In reinforcement learning (RL), the duality between exploitation and exploration has long been an important issue. This paper presents a new method that controls the balance between exploitation and exploration. Our learning scheme is based on model-based RL, in which Bayesian inference with a forgetting effect estimates the state-transition probability of the environment. The balance parameter, which corresponds to the randomness in action selection, is controlled based on variation of action…
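
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a Dirichlet-count transition estimate whose counts decay by a hypothetical `forgetting` factor, and a softmax policy whose inverse temperature `beta` plays the role of the balance parameter. The rule the paper uses to adapt that parameter is cut off in the abstract and is deliberately not reproduced here. All class, function, and parameter names are illustrative.

```python
import math
import random


class ForgettingTransitionModel:
    """Count-based estimate of P(s' | s, a) with exponential forgetting.

    Assumption: a symmetric Dirichlet prior and a constant decay factor;
    the paper's exact inference scheme may differ.
    """

    def __init__(self, n_states, n_actions, forgetting=0.95, prior=1.0):
        self.n_states = n_states
        self.forgetting = forgetting  # hypothetical decay factor in (0, 1]
        self.prior = prior            # hypothetical Dirichlet pseudo-count
        self.counts = [[[0.0] * n_states for _ in range(n_actions)]
                       for _ in range(n_states)]

    def update(self, s, a, s_next):
        """Decay old evidence for (s, a), then count the new transition."""
        row = self.counts[s][a]
        for i in range(self.n_states):
            row[i] *= self.forgetting
        row[s_next] += 1.0

    def probs(self, s, a):
        """Posterior-mean transition probabilities for (s, a)."""
        row = self.counts[s][a]
        total = sum(row) + self.prior * self.n_states
        return [(c + self.prior) / total for c in row]


def softmax_policy(q_values, beta):
    """Action probabilities; beta = 0 is uniform, large beta is near-greedy."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - m)) for q in q_values]
    z = sum(weights)
    return [w / z for w in weights]


def sample_action(q_values, beta, rng=random):
    """Draw one action index from the softmax distribution."""
    probs = softmax_policy(q_values, beta)
    return rng.choices(range(len(q_values)), weights=probs)[0]
```

With a small `beta` the agent acts almost at random (exploration); with a large `beta` it nearly always picks the highest-valued action (exploitation). The forgetting factor lets the transition estimate track an environment that changes over time, since older observations contribute less and less.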

Citations

Publications citing this paper.

Learning About Unstable, Publicly Unobservable Payoffs

Control of Unknown Nonlinear Systems With Efficient Transient Performance Using Concurrent Exploitation and Exploration

  • IEEE Transactions on Neural Networks
  • 2009

Dynamic Flexibility in Striatal-Cortical Circuits Supports Reinforcement Learning.

  • The Journal of neuroscience : the official journal of the Society for Neuroscience
  • 2018
