Contextual Bandits with Linear Payoff Functions
An O(√(Td ln³(KT ln(T)/δ))) regret bound is proved that holds with probability 1 − δ for the simplest known upper confidence bound algorithm for this problem.
Contextual Bandit Algorithms with Supervised Learning Guarantees
- A. Beygelzimer, J. Langford, Lihong Li, L. Reyzin, R. Schapire
- Computer Science · AISTATS
- 21 February 2010
We address the problem of competing with any large set of N policies in the nonstochastic bandit setting, where the learner must repeatedly select among K actions but observes only the reward of the…
How boosting the margin can also boost classifier complexity
A close look at Breiman's compelling but puzzling results finds that the poorer performance of arc-gv can be explained by the increased complexity of the base classifiers it uses, an explanation supported by experiments and entirely consistent with the margins theory.
Efficient Optimal Learning for Contextual Bandits
This work provides the first efficient algorithm with optimal regret. It uses a cost-sensitive classification learner as an oracle and has running time polylog(N), where N is the number of classification rules among which the oracle might choose.
Statistical Algorithms and a Lower Bound for Detecting Planted Cliques
- V. Feldman, Elena Grigorescu, L. Reyzin, S. Vempala, Ying Xiao
- Computer Science, Mathematics · J. ACM
- 5 January 2012
The main application is a nearly optimal lower bound on the complexity of any statistical query algorithm for detecting planted bipartite clique distributions when the planted clique has size O(n^(1/2 − δ)) for any constant δ > 0.
Non-Stochastic Bandit Slate Problems
We consider bandit problems, motivated by applications in online advertising and news story selection, in which the learner must repeatedly select a slate, that is, a subset of size s from K possible…
Inferring Social Networks from Outbreaks
This work considers the problem of inferring the most likely social network given connectivity constraints imposed by observations of outbreaks within the network, and proves an Ω(log(n)) hardness of approximation result for uniform cost networks and gives an algorithm that almost matches this bound, even for arbitrary costs.
Data stability in clustering: A closer look
Anti-coordination Games and Stable Graph Colorings
It is proved that the decision problem for non-strict equilibria in directed graphs is NP-hard, and notions of stable and strictly stable colorings in graphs are defined; these results resolve some open problems in these areas.
Learning and Verifying Graphs Using Queries with a Focus on Edge Counting
We consider the problem of learning and verifying hidden graphs and their properties given query access to the graphs. We analyze various queries (edge detection, edge counting, shortest path), but…