Solving Non-Stationary Bandit Problems by Random Sampling from Sibling Kalman Filters

@inproceedings{Granmo2010SolvingNB,
  title={Solving Non-Stationary Bandit Problems by Random Sampling from Sibling Kalman Filters},
  author={Ole-Christoffer Granmo and Stian Berg},
  booktitle={IEA/AIE},
  year={2010}
}
The multi-armed bandit problem is a classical optimization problem where an agent sequentially pulls one of multiple arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus, one must balance between exploiting existing knowledge about the arms, and obtaining new information. Dynamically changing (non-stationary) bandit problems are particularly challenging because each change of the reward distributions may progressively…
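To make the setting in the abstract concrete, here is a minimal sketch of a non-stationary bandit loop. It is not the paper's algorithm, which the truncated abstract does not spell out. Going only by the title, it assumes one scalar Kalman filter per arm, a random-walk drift model, and arm selection by drawing one random sample from each filter. The names `KalmanArm` and `run`, along with all parameter values (`obs_var`, `drift_var`, the prior, the switch point), are hypothetical choices made for illustration.

```python
import random


class KalmanArm:
    """Scalar Kalman filter tracking one arm's (possibly drifting) mean reward.

    Assumed model, not taken from the paper: the true mean follows a random
    walk with variance drift_var, and rewards have noise variance obs_var.
    """

    def __init__(self, mean=0.0, var=100.0, obs_var=1.0, drift_var=0.01):
        self.mean = mean
        self.var = var
        self.obs_var = obs_var
        self.drift_var = drift_var

    def predict(self):
        # Every round the mean may have drifted, so uncertainty grows.
        self.var += self.drift_var

    def sample(self, rng):
        # Draw one plausible mean reward from the current belief.
        return rng.gauss(self.mean, self.var ** 0.5)

    def update(self, reward):
        # Standard scalar Kalman update with the observed reward.
        gain = self.var / (self.var + self.obs_var)
        self.mean += gain * (reward - self.mean)
        self.var *= 1.0 - gain


def run(true_means, rounds=2000, switch_at=1000, seed=0):
    """Play the bandit; at round switch_at the arm means are reversed."""
    rng = random.Random(seed)
    means = list(true_means)
    arms = [KalmanArm() for _ in means]
    pulls_after_switch = [0] * len(means)
    for t in range(rounds):
        if t == switch_at:
            means.reverse()  # abrupt change of the reward distributions
        for arm in arms:
            arm.predict()
        # Pull the arm whose sampled mean is largest.
        choice = max(range(len(arms)), key=lambda i: arms[i].sample(rng))
        reward = rng.gauss(means[choice], 1.0)
        arms[choice].update(reward)
        if t >= switch_at:
            pulls_after_switch[choice] += 1
    return arms, pulls_after_switch
```

Calling `run([0.0, 2.0])` plays 2000 rounds and swaps the two arm means halfway through. Because `predict` keeps inflating the variance of arms that are not pulled, a neglected arm is eventually sampled again, which is what lets this sketch notice that the formerly worse arm has become the better one.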
This paper has 24 citations.