Corpus ID: 29162355

Projection-Free Bandit Convex Optimization

@inproceedings{Chen2019ProjectionFreeBC,
  title={Projection-Free Bandit Convex Optimization},
  author={Lin Chen and Mingrui Zhang and Amin Karbasi},
  booktitle={AISTATS},
  year={2019}
}
In this paper, we propose the first computationally efficient projection-free algorithm for bandit convex optimization (BCO). We show that our algorithm achieves a sublinear regret of $O(nT^{4/5})$ (where $T$ is the horizon and $n$ is the dimension) for all bounded convex functions with uniformly bounded gradients. We also evaluate the performance of our algorithm against baselines on both synthetic and real data sets for quadratic programming, portfolio selection, and matrix completion problems…
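Concretely, algorithms of this type combine a one-point gradient estimate (built from the single bandit value observed each round) with Frank-Wolfe-style updates that call a linear optimization oracle instead of projecting onto the feasible set. The Python sketch below illustrates that generic structure only; the oracle `ball_lmo`, the quadratic `loss`, and the exploration/step schedules are illustrative assumptions, not the paper's algorithm, parameters, or analysis.

```python
import numpy as np

def one_point_gradient_estimate(loss, x, delta, rng):
    """Single-query spherical gradient estimator (in the style of Flaxman et al.).

    Plays one perturbed point y = x + delta * u and returns y together with
    (n / delta) * loss(y) * u, an estimate of the gradient of the
    delta-smoothed loss at x.
    """
    n = x.shape[0]
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)              # uniform random direction on the unit sphere
    y = x + delta * u                   # the point that is actually played
    g_hat = (n / delta) * loss(y) * u   # one-point gradient estimate
    return y, g_hat

def frank_wolfe_step(x, g_hat, lmo, step):
    """One Frank-Wolfe-style update: a linear minimization oracle (lmo) over the
    feasible set replaces the projection step of gradient-based methods."""
    v = lmo(g_hat)                      # v = argmin_{v in K} <g_hat, v>
    return (1.0 - step) * x + step * v  # convex combination, so the iterate stays in K

def ball_lmo(g, radius=1.0):
    """Linear minimization oracle for a Euclidean ball of the given radius
    (a hypothetical constraint set used only for this illustration)."""
    norm = np.linalg.norm(g)
    return np.zeros_like(g) if norm == 0 else -radius * g / norm

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, T = 5, 2000
    x = np.zeros(n)
    target = rng.standard_normal(n) / np.sqrt(n)
    loss = lambda y: float(np.sum((y - target) ** 2))  # stand-in quadratic loss

    for t in range(1, T + 1):
        # Illustrative schedules only; the paper's analysis prescribes its own choices,
        # and a complete method would also shrink the set so that x + delta*u stays feasible.
        delta, step = t ** (-0.2), t ** (-0.5)
        y, g_hat = one_point_gradient_estimate(loss, x, delta, rng)
        x = frank_wolfe_step(x, g_hat, ball_lmo, step)

    print("final loss:", loss(x))
```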
Citations

Projection-Free Bandit Optimization with Privacy Guarantees
TLDR
This is the first differentially private algorithm for projection-free bandit optimization, and in fact its bound matches the best known non-private projection-free algorithm and the best known private algorithm, even in the weaker setting where projections are available.
Improved Regret Bounds for Projection-free Bandit Convex Optimization
TLDR
The challenge of designing online algorithms for the bandit convex optimization problem (BCO) is revisited, and an algorithm is presented that attains improved expected regret while limiting the overall number of calls to a linear optimization oracle (in expectation), where T is the number of prediction rounds.
Structured Projection-free Online Convex Optimization with Multi-point Bandit Feedback
We consider structured online convex optimization (OCO) with bandit feedback, where either the loss function is smooth or the constraint set is strongly convex. Projection-free methods are among the…
Projection-free Online Learning over Strongly Convex Sets
TLDR
This paper proves that OFW (online Frank-Wolfe) enjoys a better regret bound of $O(T^{2/3})$ for general convex losses and proposes a strongly convex variant of OFW by redefining the surrogate loss function in OFW; the standard OFW update is recalled below.
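For context, OFW here denotes the online Frank-Wolfe method of Hazan and Kale. A generic form of its update (not the strongly convex variant proposed in that paper; the step size $\sigma_t$ and surrogate $F_t$ below are assumed notation) replaces projection with a single linear minimization over the feasible set $\mathcal{K}$:

$$v_t = \arg\min_{v \in \mathcal{K}} \langle \nabla F_t(x_t), v \rangle, \qquad x_{t+1} = (1-\sigma_t)\, x_t + \sigma_t v_t, \qquad \text{e.g. } F_t(x) = \eta \sum_{s=1}^{t} \langle \nabla f_s(x_s), x \rangle + \lVert x - x_1 \rVert^2.$$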
An Optimal Algorithm for Bandit Convex Optimization with Strongly-Convex and Smooth Loss
  • Shinji Ito
  • Mathematics, Computer Science
  • AISTATS
  • 2020
TLDR
This study introduces an algorithm that achieves an optimal regret bound of $\tilde{O}(d\sqrt{T})$ under a mild assumption, without self-concordant barriers, for non-stochastic bandit convex optimization with strongly-convex and smooth loss functions.
Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization
TLDR
A new algorithm OPTPBCO is proposed that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively, guaranteeing the optimal regret bound mentioned above, up to additional logarithmic factors.
Projection-free Online Learning in Dynamic Environments
TLDR
This paper improves an existing projection-free algorithm called online conditional gradient (OCG) to enjoy small dynamic regret bounds given prior knowledge of $V_T$, achieving an $O(\max\{T^{2/3}V_T^{1/3}, \sqrt{T}\})$ dynamic regret bound for convex functions and an $O(\max\{\sqrt{TV_T\log T}, \log T\})$ dynamic regret bound for strongly convex functions.
Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback
TLDR
Bandit-Frank-Wolfe is the first bandit algorithm for continuous DR-submodular maximization, which achieves a $(1-1/e)$-regret bound of $O(T^{8/9})$ in the responsive bandit setting.
Projection-free Distributed Online Learning with Strongly Convex Losses
TLDR
It is demonstrated that the regret of distributed online algorithms with C communication rounds has a lower bound of $\Omega(T/C)$, even when the loss functions are strongly convex, which implies that the $O(T^{1/3})$ communication complexity of the proposed algorithm is nearly optimal for obtaining the $O(T^{2/3}\log T)$ regret bound, up to polylogarithmic factors.
Online Boosting with Bandit Feedback
TLDR
An efficient regret minimization method is given that has two implications: an online boosting algorithm with noisy multi-point bandit feedback, and a new projection-free online convex optimization algorithm with stochastic gradient that improves state-of-the-art guarantees in terms of efficiency.

References

SHOWING 1-10 OF 49 REFERENCES
An optimal algorithm for bandit convex optimization
TLDR
This work gives the first $\tilde{O}(\sqrt{T})$-regret algorithm for this setting, based on a novel application of the ellipsoid method to online learning; this rate is known to be tight up to logarithmic factors.
Stochastic Convex Optimization with Bandit Feedback
TLDR
This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $\mathcal{X}$ under a stochastic bandit feedback model, and demonstrates a generalization of the ellipsoid algorithm that incurs $O(\mathrm{poly}(d)\sqrt{T})$ regret.
Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization
TLDR
This work introduces an efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal $O^*(\sqrt{T})$ regret and presents a novel connection between online learning and interior point methods.
Bandit Convex Optimization: Towards Tight Bounds
TLDR
This paper gives an efficient and near-optimal regret algorithm for BCO with strongly-convex and smooth loss functions and employs an exploration scheme that shrinks with time.
Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff
TLDR
This work presents an efficient algorithm for the bandit smooth convex optimization problem that guarantees a regret of $O(T^{5/8})$, which rules out an $\Omega(T^{2/3})$ lower bound and takes a significant step towards the resolution of this open problem.
Bandit Convex Optimization: $\sqrt{T}$ Regret in One Dimension
TLDR
Focusing on the one-dimensional case, it is proved that the minimax regret is $\widetilde{\Theta}(\sqrt{T})$, partially resolving a decade-old open problem.
Improved Regret Guarantees for Online Smooth Convex Optimization with Bandit Feedback
TLDR
The first algorithm whose expected regret is $O(T^{2/3})$, ignoring constant and logarithmic factors, is given, building upon existing work on self-concordant regularizers and one-point gradient estimation.
Multi-scale exploration of convex functions and bandit convex optimization
TLDR
This paper uses a new map from a convex function to a distribution on its domain, with the property that this distribution is a multi-scale exploration of the function, to solve a decade-old open problem in adversarial bandit convex optimization.
On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization
  • O. Shamir
  • Computer Science, Mathematics
  • COLT
  • 2013
TLDR
The attainable error/regret in the bandit and derivative-free settings, as a function of the dimension d and the available number of queries T, is investigated, and a precise characterization of the attainable performance for strongly-convex and smooth functions is provided.