Forced-Exploration Based Algorithms for Playing in Stochastic Linear Bandits

@inproceedings{AbbasiYadkori2009ForcedExplorationBA,
  title={Forced-Exploration Based Algorithms for Playing in Stochastic Linear Bandits},
  author={Yasin Abbasi-Yadkori and Csaba Szepesv{\'a}ri},
  year={2009}
}
We study stochastic linear payoff bandit problems and give a simple, computationally efficient algorithm whose regret, under certain regularity assumptions on the action set, is O(d√T), where d is the dimensionality of the action space and T is the time horizon. However, this result is problem dependent and not a minimax bound. We show that our algorithm achieves lower regret bounds when the problem is sparse. Our experimental results support our upper bound and show that…
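The abstract describes a forced-exploration approach: on a decaying schedule the learner plays a randomly chosen action to keep its least-squares estimate of the unknown parameter well-conditioned, and otherwise plays greedily against that estimate. The sketch below is a hypothetical illustration of this general idea, not the paper's exact FEL algorithm; the exploration rate `5*d/t`, the arm set, and the ridge regularizer are all illustrative assumptions.

```python
import numpy as np

def forced_exploration_linear_bandit(theta, T=2000, n_arms=20, noise=0.1, seed=0):
    """Hypothetical forced-exploration sketch for a stochastic linear bandit.

    theta : unknown d-dimensional parameter; reward of action x is x@theta + noise.
    Returns the final least-squares estimate of theta and the cumulative regret.
    """
    rng = np.random.default_rng(seed)
    d = theta.shape[0]
    # A fixed finite action set of unit vectors (an illustrative choice).
    arms = rng.normal(size=(n_arms, d))
    arms /= np.linalg.norm(arms, axis=1, keepdims=True)
    A = np.eye(d)          # ridge regularizer keeps the least-squares system well-posed
    b = np.zeros(d)
    opt = (arms @ theta).max()
    regret = 0.0
    for t in range(1, T + 1):
        if rng.random() < min(1.0, 5 * d / t):
            # Forced exploration: play a uniformly random arm (decaying rate, assumed form).
            x = arms[rng.integers(n_arms)]
        else:
            # Exploit: play greedily against the current least-squares estimate.
            theta_hat = np.linalg.solve(A, b)
            x = arms[np.argmax(arms @ theta_hat)]
        r = x @ theta + noise * rng.normal()
        A += np.outer(x, x)   # accumulate the design matrix
        b += r * x            # accumulate reward-weighted actions
        regret += opt - x @ theta
    return np.linalg.solve(A, b), regret
```

A short usage example: with `theta = np.ones(5) / np.sqrt(5)`, the cumulative regret grows mostly during the early forced-exploration rounds, while later rounds are dominated by near-greedy play against an increasingly accurate estimate.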


