• Corpus ID: 11379717

# Second-Order Kernel Online Convex Optimization with Adaptive Sketching

@inproceedings{Calandriello2017SecondOrderKO,
title={Second-Order Kernel Online Convex Optimization with Adaptive Sketching},
author={Daniele Calandriello and Alessandro Lazaric and Michal Valko},
booktitle={ICML},
year={2017}
}
• Published in ICML 15 June 2017
• Computer Science
Kernel online convex optimization (KOCO) is a framework combining the expressiveness of non-parametric kernel models with the regret guarantees of online learning. First-order KOCO methods such as functional gradient descent require only $O(t)$ time and space per iteration, and, when the only information on the losses is their convexity, achieve a minimax optimal $O(\sqrt{T})$ regret. Nonetheless, many common losses in kernel problems, such as squared loss, logistic loss, and squared hinge loss…
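The first-order method described in the abstract can be sketched in a few lines: functional gradient descent keeps the predictor as a growing kernel expansion, so each round costs $O(t)$ time and space. The Gaussian kernel, learning rate, and toy data below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two input vectors."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

class KernelOGD:
    """First-order KOCO via functional gradient descent on the squared loss.

    The predictor is f_t(x) = sum_i alpha_i k(x_i, x); every round appends
    one support point, so per-round time and space grow as O(t)."""

    def __init__(self, eta=0.3):
        self.eta = eta      # learning rate
        self.points = []    # stored inputs
        self.alphas = []    # expansion coefficients

    def predict(self, x):
        return sum(a * gaussian_kernel(p, x)
                   for p, a in zip(self.points, self.alphas))

    def update(self, x, y):
        # Functional gradient of (f(x) - y)^2 at f_t is 2 (f_t(x) - y) k(x, .);
        # stepping against it appends one weighted copy of k(x, .).
        residual = self.predict(x) - y
        self.points.append(x)
        self.alphas.append(-2.0 * self.eta * residual)

# Usage: learn sign(x[0]) online from 200 random points.
rng = np.random.default_rng(0)
model = KernelOGD()
for _ in range(200):
    x = rng.normal(size=2)
    model.update(x, 1.0 if x[0] > 0 else -1.0)
```

The growing support set is exactly the cost that sketching-based second-order methods, the subject of this paper, aim to avoid.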

### Efficient Second-Order Online Kernel Learning with Adaptive Embedding

• Computer Science
NIPS
• 2017
This paper proposes PROS-N-KONS, a method that combines Nyström sketching, which projects the input point into a small, accurate embedded space, with efficient second-order updates in that space, achieving logarithmic regret.
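The Nyström embedding at the core of this kind of sketching can be written compactly: a point is mapped to $K_{xm} K_{mm}^{-1/2}$ using $m$ landmark points, so inner products in the embedded space approximate the full kernel. The landmark choice, Gaussian kernel, and eigenvalue floor below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def gaussian_gram(A, B, sigma=1.0):
    """Pairwise Gaussian kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def nystrom_features(X, landmarks, sigma=1.0):
    """Nystrom embedding z(x) = K_xm K_mm^{-1/2}: inner products of the
    embedded points approximate the full kernel using only m landmarks."""
    Kmm = gaussian_gram(landmarks, landmarks, sigma)
    vals, vecs = np.linalg.eigh(Kmm)      # symmetric eigendecomposition
    vals = np.maximum(vals, 1e-12)        # guard against round-off
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return gaussian_gram(X, landmarks, sigma) @ inv_sqrt

# Usage: with the landmarks equal to the data itself, Z Z^T recovers the
# exact kernel matrix K, since K K^{-1} K = K.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Z = nystrom_features(X, X)
K = gaussian_gram(X, X)
```

With fewer landmarks than data points, $ZZ^\top$ is only an approximation of $K$, and the quality of the landmark selection is what adaptive sketching controls.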

### Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity

• Computer Science
ICML
• 2018
Meta-Frank-Wolfe is proposed, the first online projection-free algorithm that uses stochastic gradient estimates, and a novel "lifting" framework for online discrete submodular maximization is developed; both outperform current state-of-the-art techniques in various experiments.

### Dynamic Regret for Strongly Adaptive Methods and Optimality of Online KRR

• Computer Science
ArXiv
• 2021
It is demonstrated that Strongly Adaptive (SA) algorithms can be viewed as a principled way of controlling dynamic regret in terms of the path variation $V_T$ of the comparator sequence, without a priori knowledge of $V_T$.

### Efficient online learning with kernels for adversarial large scale problems

• Computer Science
NeurIPS
• 2019
The resulting algorithm is based on approximations of the Gaussian kernel through Taylor expansion and achieves, for $d$-dimensional inputs, a (close to) optimal regret of order $O((\log n)^{d+1})$ with low per-round time and space complexity.

### Dynamic Online Learning via Frank-Wolfe Algorithm

• Computer Science
IEEE Transactions on Signal Processing
• 2021
This work proposes to study Frank-Wolfe (FW), which operates by updating along directions aligned with the gradient while guaranteed to remain feasible, and establishes performance in terms of dynamic regret, which quantifies cost accumulation compared with the optimum at each individual time slot.
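The projection-free property mentioned above comes from Frank-Wolfe's linear minimization oracle: the next direction points at a vertex of the feasible set, so the convex-combination update can never leave it. A minimal sketch over the $\ell_1$ ball follows; the feasible set, step-size schedule, and toy objective are assumptions for illustration.

```python
import numpy as np

def fw_step(x, grad, radius=1.0, gamma=0.1):
    """One Frank-Wolfe step over the l1 ball of the given radius.

    The linear minimization oracle returns the vertex
    s = -radius * sign(g_i) * e_i for the coordinate i with the largest
    |gradient|; moving along s - x keeps the iterate inside the ball
    without any projection."""
    i = np.argmax(np.abs(grad))
    s = np.zeros_like(x)
    s[i] = -radius * np.sign(grad[i])
    return x + gamma * (s - x)

# Usage: minimize f(x) = ||x - b||^2 over the unit l1 ball with the
# classical decaying step size 2 / (t + 2).
b = np.array([0.4, 0.1, -0.1])
x = np.zeros(3)
for t in range(2000):
    grad = 2.0 * (x - b)
    x = fw_step(x, grad, gamma=2.0 / (t + 2))
```

The iterate is always a convex combination of $\ell_1$-ball vertices, which is exactly what makes the method cheap when projection onto the set is expensive.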

### Projection Free Dynamic Online Learning

• Computer Science
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
• 2020
A projection-free scheme based on Frank-Wolfe is proposed in which, instead of exact online gradients, the algorithm requires only noisy gradient estimates, i.e., partial feedback, and dynamic regret bounds are derived.

### Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret

• Computer Science
COLT
• 2019
BKB (budgeted kernelized bandit) is a new approximate GP algorithm for optimization under bandit feedback that achieves near-optimal regret (and hence a near-optimal convergence rate) with near-constant per-iteration complexity and, remarkably, no assumption on the input space or covariance of the GP.

### Efficient online learning with kernels for adversarial large scale problems

• Computer Science
• 2022
The algorithm studied achieves the optimal regret for a wide range of kernels with a per-round complexity of order $n^\alpha$ with $\alpha < 2$, improving the computational trade-off known for online kernel regression.

### Learning with Kernels for Adversarial Large Scale Problems

• Computer Science
• 2019
The algorithm studied is the first to achieve the optimal regret for a wide range of kernels with a per-round complexity of order $n^\alpha$ with $\alpha < 2$, and it improves the computational trade-off known for online kernel regression.

### Sparse Representations of Positive Functions via First- and Second-Order Pseudo-Mirror Descent

• Computer Science
IEEE Transactions on Signal Processing
• 2022
First- and second-order variants of stochastic mirror descent employing pseudo-gradients and complexity-reducing projections are developed, establishing trade-offs between the radius of convergence of the expected sub-optimality and the projection budget parameter, as well as non-asymptotic bounds on the model complexity.

## References

SHOWING 1-10 OF 31 REFERENCES

### Logarithmic regret algorithms for online convex optimization

• Computer Science
Machine Learning
• 2007
Several algorithms achieving logarithmic regret are proposed, which, besides being more general, are also much more efficient to implement; they give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field.
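The Newton-style method this refers to can be sketched as an Online Newton Step: accumulate gradient outer products and precondition each step by their inverse. The loss, the parameter $\gamma$, and the simple Euclidean projection below are illustrative assumptions (the original algorithm projects in the norm induced by $A_t$).

```python
import numpy as np

def ons_step(w, grad, A, gamma=0.5, radius=1.0):
    """One Online Newton Step: accumulate the rank-one matrix grad grad^T
    into A, move against the A^{-1}-preconditioned gradient, then project
    back onto the l2 ball (a crude Euclidean stand-in for the exact
    projection in the A-norm)."""
    A += np.outer(grad, grad)
    w = w - (1.0 / gamma) * np.linalg.solve(A, grad)
    norm = np.linalg.norm(w)
    if norm > radius:
        w *= radius / norm
    return w, A

# Usage: online squared loss against a fixed comparator w_star.
rng = np.random.default_rng(0)
w_star = np.array([0.5, -0.3])
w = np.zeros(2)
A = np.eye(2)                       # initial regularizer eps * I
losses = []
for _ in range(500):
    x = rng.normal(size=2)
    y = w_star @ x
    losses.append((w @ x - y) ** 2)
    grad = 2 * (w @ x - y) * x
    w, A = ons_step(w, grad, A)
```

The preconditioning by $A_t^{-1}$ is what turns the $O(\sqrt{T})$ first-order rate into logarithmic regret for exp-concave losses such as the squared loss.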

### Fast Randomized Kernel Methods With Statistical Guarantees

• Computer Science
ArXiv
• 2014
A version of this approach is described that comes with running-time guarantees as well as improved guarantees on its statistical performance, and a new notion of the statistical leverage of a data point captures, in a fine-grained way, the difficulty of the original statistical learning problem.

### Dual Space Gradient Descent for Online Learning

• Computer Science
NIPS
• 2016
The Dual Space Gradient Descent (DualSGD) is presented, a novel framework that utilizes random features as an auxiliary space to maintain information from data points removed during budget maintenance while simultaneously mitigating the impact of the dimensionality issue on learning performance.

### Breaking the curse of kernelization: budgeted stochastic gradient descent for large-scale SVM training

• Computer Science
J. Mach. Learn. Res.
• 2012
Comprehensive empirical results show that BSGD achieves higher accuracy than state-of-the-art budgeted online algorithms and accuracy comparable to that of non-budgeted algorithms, while achieving impressive computational efficiency in both time and space during training and prediction.

### Online learning with kernels

• Computer Science
IEEE Transactions on Signal Processing
• 2004
This paper considers online learning in a reproducing kernel Hilbert space, and allows the exploitation of the kernel trick in an online setting, and examines the value of large margins for classification in the online setting with a drifting target.

### Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design

• Computer Science
ICML
• 2010
This work analyzes GP-UCB, an intuitive upper-confidence-based algorithm, and bounds its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design and obtaining explicit sublinear regret bounds for many commonly used covariance functions.
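The GP-UCB acquisition rule itself is short: compute the GP posterior and query the point maximizing mean plus a confidence width times the standard deviation. The Gaussian kernel, grid, bandwidth, and width $\beta = 2$ below are assumptions for illustration.

```python
import numpy as np

def gaussian_gram(A, B, sigma=0.2):
    """Pairwise Gaussian kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def gp_posterior(X, y, Xs, sigma=0.2, noise=0.1):
    """GP posterior mean and standard deviation at the test points Xs."""
    K = gaussian_gram(X, X, sigma) + noise**2 * np.eye(len(X))
    Ks = gaussian_gram(X, Xs, sigma)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = 1.0 - np.sum(Ks * (Kinv @ Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 0.0))

# Usage: after observing f(x) = -(x - 0.3)^2 at three nearby points,
# the acquisition mu + beta * sd sends the next query to the most
# uncertain region of the grid (the far endpoint), i.e. it explores.
f = lambda x: -(x - 0.3) ** 2
X = np.array([[0.1], [0.3], [0.5]])
y = f(X).ravel()
grid = np.linspace(0.0, 1.0, 101)[:, None]
mu, sd = gp_posterior(X, y, grid)
next_x = grid[np.argmax(mu + 2.0 * sd), 0]
```

The $O(t^3)$ posterior update is the cost that sketched variants such as BKB (above) reduce to near-constant per-iteration complexity.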

### Large Scale Online Kernel Learning

• Computer Science
J. Mach. Learn. Res.
• 2016
A new framework for large-scale online kernel learning is presented, making kernel methods efficient and scalable for large-scale online learning applications, along with two different online kernel learning algorithms that apply random Fourier features to approximate kernel functions.
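The random Fourier feature map referenced here (Rahimi and Recht's construction) replaces the kernel with an explicit finite-dimensional embedding. A minimal sketch for the Gaussian kernel follows; the feature count, bandwidth, and test points are illustrative assumptions.

```python
import numpy as np

def random_fourier_features(X, n_features=5000, sigma=1.0, seed=0):
    """Map the rows of X to a feature space whose inner products
    approximate the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))
    in expectation over the random frequencies W and phases b."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: the feature-space inner product tracks the exact kernel value.
X = np.array([[0.0, 0.0], [0.5, -0.5]])
Z = random_fourier_features(X)
approx = float(Z[0] @ Z[1])
exact = float(np.exp(-np.sum((X[0] - X[1]) ** 2) / 2.0))
```

Because the embedding is fixed and finite-dimensional, any linear online learner run on $Z$ becomes an approximate kernel method with constant per-round cost.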

### Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

• Computer Science
J. Mach. Learn. Res.
• 2011
This work describes and analyzes an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as those of the best proximal function that could have been chosen in hindsight.

### Adam: A Method for Stochastic Optimization

• Computer Science
ICLR
• 2015
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
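The moment estimates Adam maintains fit in a few lines: exponential moving averages of the gradient and its square, bias-corrected, then a per-coordinate scaled step. The toy quadratic and the $\alpha/\sqrt{t}$ step-size decay used in the usage loop (the schedule the paper's regret analysis assumes) are illustrative choices.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and of its square (v), bias-corrected by 1 - beta^t, then a step
    scaled per coordinate by the inverse root of the second moment."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Usage: minimize the separable quadratic f(w) = ||w - 3||^2 with a
# decaying step size lr_t = 0.1 / sqrt(t).
w, m, v = np.zeros(2), np.zeros(2), np.zeros(2)
for t in range(1, 2001):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.1 / np.sqrt(t))
```

The second-moment scaling makes the step size roughly invariant to the gradient's magnitude, which is why a single learning rate works across coordinates of very different scales.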