Moto Kamiura

  • Citations Per Year
Learn More
The principle of optimism in the face of uncertainty is known as a heuristic in sequential decision-making problems. Overtaking method based on this principle is an effective algorithm to solve multi-armed bandit problems. It was defined by a set of some heuristic patterns of the formulation in the previous study. The objective of the present paper is to(More)
A multi-armed bandit problem is a search problem on which a learning agent must select the optimal arm among multiple slot machines generating random rewards. UCB algorithm is one of the most popular methods to solve multi-armed bandit problems. It achieves logarithmic regret performance by coordinating balance between exploration and exploitation. Since(More)
  • 1