Corpus ID: 219260854

Learning with CVaR-based feedback under potentially heavy tails

Matthew J. Holland and El Mehdi Haress
We study learning algorithms that seek to minimize the conditional value-at-risk (CVaR), when all the learner knows is that the losses incurred may be heavy-tailed. We begin by studying a general-purpose estimator of CVaR for potentially heavy-tailed random variables, which is easy to implement in practice, and requires nothing more than finite variance and a distribution function that does not change too fast or slow around just the quantile of interest. With this estimator in hand, we then… 
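For readers unfamiliar with CVaR, the sketch below shows the standard plug-in (empirical) estimate in its Rockafellar-Uryasev form, CVaR_α(X) = VaR_α(X) + E[(X − VaR_α(X))₊] / (1 − α). This is not the estimator proposed in the paper, whose details the truncated abstract does not give; the function name `empirical_cvar` and the default level `alpha=0.95` are illustrative assumptions. Under heavy tails, the plain sample mean used here is the step one would expect a robust method to replace.

```python
import numpy as np


def empirical_cvar(losses, alpha=0.95):
    """Plug-in CVaR estimate at level alpha (hypothetical helper, not the paper's estimator).

    Uses the Rockafellar-Uryasev form: VaR + E[(X - VaR)_+] / (1 - alpha),
    with the empirical quantile as VaR and a plain sample mean for the excess.
    """
    x = np.asarray(losses, dtype=float)
    if x.ndim != 1 or x.size == 0:
        raise ValueError("losses must be a non-empty 1-D sequence")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")

    # Empirical value-at-risk: the alpha-quantile of the observed losses.
    var = np.quantile(x, alpha)

    # Average amount by which losses exceed the quantile (zero for losses below it).
    excess = np.maximum(x - var, 0.0)

    # The sample mean here is sensitive to heavy tails; it is the naive choice.
    return var + excess.mean() / (1.0 - alpha)


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(empirical_cvar(sample, alpha=0.8))
```

At `alpha=0` the estimate reduces to the ordinary sample mean, which is a quick sanity check on the formula.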


Optimal Thompson Sampling strategies for support-aware CVaR bandits
Results show that B-CVTS for continuous bounded rewards and M-CVTS for multinomial distributions are the first to provably achieve asymptotic optimality in CVaR bandits, matching the corresponding asymptotic lower bounds for this setting.
A General Framework for Bandit Problems Beyond Cumulative Objectives
This work derives general conditions under which the oracle policy is sufficiently tractable to facilitate the design of optimism-based (upper confidence bound) learning policies, and elucidates an interesting interplay between the arm reward distributions and the performance metric.


Statistical Learning with Conditional Value at Risk
We propose a risk-averse statistical learning framework wherein the performance of a learning algorithm is evaluated by the conditional value-at-risk (CVaR) of losses rather than the expected loss.
Improved scalability under heavy tails, without strong convexity
This work studies a simple algorithmic strategy that can be leveraged when both losses and gradients can be heavy-tailed, and introduces a simple robust validation sub-routine, which is used to boost the confidence of inexpensive gradient-based sub-processes.
Better generalization with less data using robust gradient descent
A technique which uses a cheap and robust iterative estimate of the risk gradient, which can be easily fed into any steepest descent procedure, and which illustrates that more efficient and reliable learning is possible without prior knowledge of the loss tails.
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions
This work derives concentration bounds for CVaR estimates, considering separately the cases of light-tailed and heavy-tailed distributions, and derives an upper-bound on the probability of incorrect identification by the proposed algorithm.
Loss Minimization and Parameter Estimation with Heavy Tails
The technique can be used for approximate minimization of smooth and strongly convex losses, and specifically for least squares linear regression and low-rank covariance matrix estimation with similar allowances on the noise and covariate distributions.
Risk-aware Multi-armed Bandits Using Conditional Value-at-Risk
This work derives a novel two-sided concentration bound for a well-known CVaR estimator based on the empirical distribution function, assuming that the underlying distribution is unbounded but either sub-Gaussian or light-tailed, and derives an upper bound on the probability of incorrect identification by the proposed algorithm.
Bandits With Heavy Tail
This paper examines the bandit problem under the weaker assumption that the distributions have moments of order 1 + ε, and derives matching lower bounds that show that the best achievable regret deteriorates when ε < 1.
PAC-Bayes under potentially heavy tails
We derive PAC-Bayesian learning guarantees for heavy-tailed losses, and obtain a novel optimal Gibbs posterior which enjoys finite-sample excess risk bounds at logarithmic confidence. Our core
Distribution oblivious, risk-aware algorithms for multi-armed bandits with unbounded rewards
A class of distribution-oblivious algorithms is provided with provable upper bounds on the probability of incorrect identification. These algorithms perform competitively when compared with non-oblivious algorithms, suggesting that distribution obliviousness can be realised in practice without incurring a significant loss of performance.