We consider a generalization of stochastic bandits where the set of arms, X, is allowed to be a generic measurable space and the mean-payoff function is "locally Lipschitz" with respect to a dissimilarity function that is known to the decision maker.

We consider the framework of stochastic multi-armed bandit problems and study the possibilities and limitations of forecasters that perform an on-line exploration of the arms.

We consider the framework of stochastic multi-armed bandit problems and study the possibilities and limitations of strategies that perform an online exploration of the arms.

We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm with multiple learning rates.

We consider a generalization of stochastic bandit problems where the set of arms, Χ, is allowed to be a generic topological space and the mean-payoff function is constrained by a dissimilarity function over Χ in a way that is more general than Lipschitz.

We consider a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit problem in the case of distributions with finite supports (not necessarily known beforehand), whose asymptotic regret matches the lower bound of Burnetas and Katehakis (1996).

We consider repeated games in which the player, instead of observing the action chosen by the opponent in each game round, receives feedback generated by the combined choice of the two players.