Learning with CVaR-based feedback under potentially heavy tails
@article{Holland2020LearningWC,
  title={Learning with CVaR-based feedback under potentially heavy tails},
  author={Matthew J. Holland and El Mehdi Haress},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.02001}
}
We study learning algorithms that seek to minimize the conditional value-at-risk (CVaR), when all the learner knows is that the losses incurred may be heavy-tailed. We begin by studying a general-purpose estimator of CVaR for potentially heavy-tailed random variables, which is easy to implement in practice, and requires nothing more than finite variance and a distribution function that does not change too fast or slow around just the quantile of interest. With this estimator in hand, we then…
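The paper's exact estimator is not reproduced in this abstract, but the sketch below illustrates the general recipe such estimators follow: plug a robust mean of the tail excess into the Rockafellar-Uryasev representation CVaR_alpha(X) = min over v of { v + E[(X - v)_+] / (1 - alpha) }. The use of a median-of-means excess estimate here is an assumption for illustration only, not necessarily the estimator studied by Holland and Haress.

```python
import numpy as np

def median_of_means(x, n_blocks=5, rng=None):
    """Median-of-means estimate of E[x]; robust under heavy tails with finite variance."""
    rng = np.random.default_rng(rng)
    x = rng.permutation(np.asarray(x, dtype=float))
    blocks = np.array_split(x, n_blocks)
    return float(np.median([b.mean() for b in blocks]))

def cvar_estimate(losses, alpha=0.95, n_blocks=5, rng=None):
    """Plug-in CVaR estimate at level alpha via the Rockafellar-Uryasev form:
    CVaR_alpha ~= v_hat + E[(X - v_hat)_+] / (1 - alpha),
    with v_hat the empirical alpha-quantile and the excess mean made robust."""
    losses = np.asarray(losses, dtype=float)
    v_hat = np.quantile(losses, alpha)            # empirical VaR (alpha-quantile)
    excess = np.maximum(losses - v_hat, 0.0)      # tail excess (X - v_hat)_+
    return v_hat + median_of_means(excess, n_blocks, rng) / (1.0 - alpha)

# Example with heavy-tailed (Pareto) losses
losses = np.random.default_rng(0).pareto(2.5, size=10_000) + 1.0
print(cvar_estimate(losses, alpha=0.95))
```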
2 Citations
Optimal Thompson Sampling strategies for support-aware CVaR bandits
- Computer Science, ICML
- 2021
Results show that B-CVTS for continuous bounded rewards and M-CVTS for multinomial distributions are the first to provably achieve asymptotic optimality in CVaR bandits, matching the corresponding asymptotic lower bounds for this setting.
A General Framework for Bandit Problems Beyond Cumulative Objectives.
- Computer Science
- 2018
This work derives general conditions under which the oracle policy is sufficiently tractable to facilitate the design of optimism-based (upper confidence bound) learning policies, and elucidates an interesting interplay between the arm reward distributions and the performance metric.
References
Showing 1-10 of 40 references
Statistical Learning with Conditional Value at Risk
- Computer Science, ArXiv
- 2020
We propose a risk-averse statistical learning framework wherein the performance of a learning algorithm is evaluated by the conditional value-at-risk (CVaR) of losses rather than the expected loss.…
Improved scalability under heavy tails, without strong convexity
- Computer Science, ArXiv
- 2020
This work studies a simple algorithmic strategy that can be leveraged when both losses and gradients can be heavy-tailed, and introduces a simple robust validation sub-routine, which is used to boost the confidence of inexpensive gradient-based sub-processes.
Better generalization with less data using robust gradient descent
- Computer Science, ICML
- 2019
A technique that uses a cheap, robust iterative estimate of the risk gradient, which can be easily fed into any steepest descent procedure, illustrating that more efficient and reliable learning is possible without prior knowledge of the loss tails.
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions
- Mathematics, Computer Science, ICML
- 2020
This work derives concentration bounds for CVaR estimates, treating the light-tailed and heavy-tailed cases separately, and gives an upper bound on the probability of incorrect identification for the proposed algorithm.
Loss Minimization and Parameter Estimation with Heavy Tails
- Computer Science, Mathematics, J. Mach. Learn. Res.
- 2016
The technique can be used for approximate minimization of smooth and strongly convex losses, and specifically for least squares linear regression and low-rank covariance matrix estimation with similar allowances on the noise and covariate distributions.
Risk-aware Multi-armed Bandits Using Conditional Value-at-Risk
- Computer Science, ArXiv
- 2019
This work derives a novel two-sided concentration bound for a well-known CVaR estimator based on the empirical distribution function, assuming that the underlying distribution is unbounded but either sub-Gaussian or light-tailed, and gives an upper bound on the probability of incorrect identification for the proposed algorithm.
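For reference, the "well-known CVaR estimator based on the empirical distribution function" mentioned above is, in one common variant, simply the average of the worst (1 - alpha) fraction of the observed losses. A minimal sketch (function and variable names are mine):

```python
import numpy as np

def empirical_cvar(losses, alpha=0.95):
    """EDF-based CVaR estimate: average of the worst ceil(n * (1 - alpha)) losses."""
    losses = np.sort(np.asarray(losses, dtype=float))
    k = int(np.ceil((1.0 - alpha) * losses.size))  # number of tail samples
    return float(losses[-k:].mean())
```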
Bandits With Heavy Tail
- Computer Science, Mathematics, IEEE Transactions on Information Theory
- 2013
This paper examines the bandit problem under the weaker assumption that the distributions have moments of order 1 + ε, and derives matching lower bounds that show that the best achievable regret deteriorates when ε < 1.
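The algorithmic idea in that line of work is to replace the empirical mean inside the upper confidence bound with a mean estimator that concentrates under only 1 + ε moments, such as a truncated empirical mean. The sketch below shows the standard truncated-mean form, with the threshold calibrated for a known bound u on the (1 + ε)-th absolute moment and confidence level δ; the constants are a textbook choice rather than a quotation from the paper.

```python
import numpy as np

def truncated_mean(x, u, eps, delta):
    """Truncated empirical mean: zero out samples whose magnitude exceeds a
    threshold growing with the sample index, assuming E|X|^(1+eps) <= u and
    target confidence delta. Samples are assumed to be in observation order."""
    x = np.asarray(x, dtype=float)
    idx = np.arange(1, x.size + 1)
    thresh = (u * idx / np.log(1.0 / delta)) ** (1.0 / (1.0 + eps))
    return float(np.mean(np.where(np.abs(x) <= thresh, x, 0.0)))
```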
PAC-Bayes under potentially heavy tails
- Computer Science, NeurIPS
- 2019
We derive PAC-Bayesian learning guarantees for heavy-tailed losses, and obtain a novel optimal Gibbs posterior which enjoys finite-sample excess risk bounds at logarithmic confidence. Our core…
Concentration bounds for empirical conditional value-at-risk: The unbounded case
- Mathematics, Oper. Res. Lett.
- 2019
Distribution oblivious, risk-aware algorithms for multi-armed bandits with unbounded rewards
- Computer Science, Mathematics, NeurIPS
- 2019
A class of distribution-oblivious algorithms is provided, with provable upper bounds on the probability of incorrect identification; these algorithms perform competitively when compared with non-oblivious algorithms, suggesting that distribution obliviousness can be realised in practice without a significant loss of performance.