Stochastic Linear Optimization under Bandit Feedback
A nearly complete characterization of the classical stochastic k-armed bandit problem in terms of both upper and lower bounds for the regret is given, and two variants of an algorithm based on the idea of “upper confidence bounds” are presented.
The adwords problem: online keyword matching with budgeted bidders under random permutations
The problem of a search engine trying to assign a sequence of search keywords to a set of competing bidders, each with a daily spending limit, is considered, and the current literature on this problem is extended by considering the setting where the keywords arrive in a random order.
The Price of Bandit Information for Online Optimization
This paper presents an algorithm which achieves O*(n^(3/2) √T) regret and presents lower bounds showing that this gap is at least √n, which is conjectured to be the correct order.
High-Probability Regret Bounds for Bandit Online Linear Optimization
- P. Bartlett, Varsha Dani, Thomas P. Hayes, S. Kakade, A. Rakhlin, Ambuj Tewari
- Computer Science, Mathematics
- COLT
- 1 July 2008
This paper eliminates the gap between the high-probability bounds obtained in the full-information vs bandit settings, and improves on the previous algorithm, whose regret is bounded in expectation against an oblivious adversary.
A simple condition implying rapid mixing of single-site dynamics on spin systems
- Thomas P. Hayes
- Mathematics
- 47th Annual IEEE Symposium on Foundations of…
- 21 October 2006
This work considers the Ising model, hard-core lattice gas model, and graph colorings, relating the mixing time of the Glauber dynamics to the maximum eigenvalue of the adjacency matrix of the graph.
A general lower bound for mixing of single-site dynamics on graphs
- Thomas P. Hayes, A. Sinclair
- Mathematics, Computer Science
- 46th Annual IEEE Symposium on Foundations of…
- 25 July 2005
We prove that any Markov chain that performs local, reversible updates on randomly chosen vertices of a bounded-degree graph necessarily has mixing time at least Ω(n log n), where n is the…
Robbing the bandit: less regret in online geometric optimization against an adaptive adversary
It is proved that, for a large class of full-information online optimization problems, the optimal regret against an adaptive adversary is the same as against a non-adaptive adversary.
A non-Markovian coupling for randomly sampling colorings
It is proved that the Glauber dynamics is close to the uniform distribution after O(n log n) steps whenever k > (1 + ε)Δ, for all ε > 0.
Error limiting reductions between classification tasks
- A. Beygelzimer, Varsha Dani, Thomas P. Hayes, J. Langford, B. Zadrozny
- Computer Science
- ICML
- 7 August 2005
We introduce a reduction-based model for analyzing supervised learning tasks. We use this model to devise a new reduction from multi-class cost-sensitive classification to binary classification with…
A large-deviation inequality for vector-valued martingales
- Thomas P. Hayes
Let X = (X_0, …, X_n) be a discrete-time martingale taking values in any real Euclidean space such that X_0 = 0 and, for all n, ‖X_n − X_{n−1}‖ ≤ 1. We prove the large deviation bound Pr[‖X_n‖ ≥ a] < 2e^(1 − (a − 1)^2…