A nearly complete characterization of the classical stochastic k-armed bandit problem in terms of both upper and lower bounds for the regret is given, and two variants of an algorithm based on the idea of “upper confidence bounds” are presented.

The problem of a search engine trying to assign a sequence of search keywords to a set of competing bidders, each with a daily spending limit, is considered, and the current literature on this problem is extended by considering the setting where the keywords arrive in a random order.

This paper presents an algorithm that achieves $O^*(n^{3/2}\sqrt{T})$ regret and presents lower bounds showing that this gap is at least $\sqrt{n}$, which is conjectured to be the correct order.

This paper eliminates the gap between the high-probability bounds obtained in the full-information vs bandit settings, and improves on the previous algorithm [8] whose regret is bounded in expectation against an oblivious adversary.

This work considers the Ising model, hard-core lattice gas model, and graph colorings, relating the mixing time of the Glauber dynamics to the maximum eigenvalue for the adjacency matrix of the graph.

We prove that any Markov chain that performs local, reversible updates on randomly chosen vertices of a bounded-degree graph necessarily has mixing time at least $\Omega(n \log n)$, where n is the…

It is proved that, for a large class of full-information online optimization problems, the optimal regret against an adaptive adversary is the same as against a non-adaptive adversary.

It is proved that the Glauber dynamics is close to the uniform distribution after $O(n \log n)$ steps whenever $k > (1 + \epsilon)\Delta$, for all $\epsilon > 0$.

We introduce a reduction-based model for analyzing supervised learning tasks. We use this model to devise a new reduction from multi-class cost-sensitive classification to binary classification with…

Let $X = (X_0, \ldots, X_n)$ be a discrete-time martingale taking values in any real Euclidean space such that $X_0 = 0$ and for all n, $\|X_n - X_{n-1}\| \le 1$. We prove the large deviation bound $\Pr[\|X_n\| \ge a] < 2e^{1-(a-1)^2 \ldots}$