Corpus ID: 19211085

Theory of Convex Optimization for Machine Learning

@article{Bubeck2014TheoryOC,
  title={Theory of Convex Optimization for Machine Learning},
  author={S{\'e}bastien Bubeck},
  journal={ArXiv},
  year={2014},
  volume={abs/1405.4980}
}
This monograph presents the main mathematical ideas in convex optimization. Starting from the fundamental theory of black-box optimization, the material progresses towards recent advances in structural optimization and stochastic optimization. Our presentation of black-box optimization, strongly influenced by the seminal book of Nesterov, includes the analysis of the Ellipsoid Method, as well as (accelerated) gradient descent schemes. We also pay special attention to non-Euclidean settings…
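As a concrete illustration of the (accelerated) gradient descent schemes mentioned in the abstract, here is a minimal sketch of a Nesterov-style accelerated gradient method on a smooth convex quadratic. The objective, step size, and momentum schedule are illustrative assumptions, not taken from the monograph.

```python
import numpy as np

def accelerated_gradient_descent(grad, x0, L, iterations=200):
    """Nesterov-style accelerated gradient descent for an L-smooth convex objective.

    grad: callable returning the gradient of f at a point.
    L:    smoothness constant (here, the largest eigenvalue of the quadratic's Hessian).
    """
    x = x0.copy()
    y = x0.copy()
    t = 1.0
    for _ in range(iterations):
        x_next = y - grad(y) / L                           # gradient step at the extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # standard momentum schedule
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)   # extrapolation (momentum) step
        x, t = x_next, t_next
    return x

# Illustrative smooth convex quadratic f(x) = 0.5 * x^T A x - b^T x (hypothetical data).
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 10))
A = M.T @ M + 0.1 * np.eye(10)        # positive definite Hessian
b = rng.standard_normal(10)
L = np.linalg.eigvalsh(A).max()       # smoothness constant of f

x_hat = accelerated_gradient_descent(lambda x: A @ x - b, np.zeros(10), L)
print(np.linalg.norm(A @ x_hat - b))  # gradient norm at the output; should be close to zero
```

For an L-smooth convex objective this scheme attains the O(1/k²) rate discussed in the black-box part of the monograph, whereas plain gradient descent only attains O(1/k).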

Citations

On Global Linear Convergence in Stochastic Nonconvex Optimization for Semidefinite Programming
TLDR
It is shown that the stochastic gradient descent method can be adapted to solve the nonconvex reformulation of the original convex problem with global linear convergence when using a fixed step size, i.e., it converges exponentially fast to the population minimizer within optimal statistical precision in the restricted strongly convex case.
Reusing Combinatorial Structure: Faster Iterative Projections over Submodular Base Polytopes
TLDR
This work considers iterative projections of close-by points over widely-prevalent submodular base polytopes B(f), and develops a toolkit to speed up the computation of projections using both discrete and continuous perspectives.
Stochastic Gradient Descent For Modern Machine Learning: Theory, Algorithms And Applications
TLDR
This thesis considers the behavior of the final iterate of SGD with varying stepsize schemes, including the standard polynomially decaying stepsizes and the practically preferred step decay scheme, with an aim to achieve minimax rates.
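To make the stepsize schemes mentioned above concrete, the sketch below contrasts a polynomially decaying stepsize with a step-decay schedule in plain SGD on a least-squares problem. The data, constants, and decay parameters are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def sgd(stepsize, x0, data, epochs=30, seed=0):
    """Plain SGD on least squares; `stepsize(k, epoch)` returns the stepsize at step k."""
    A, b = data
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    x, k = x0.copy(), 0
    for epoch in range(epochs):
        for i in rng.permutation(n):
            g = (A[i] @ x - b[i]) * A[i]           # stochastic gradient of 0.5 * (a_i^T x - b_i)^2
            x = x - stepsize(k, epoch) * g
            k += 1
    return x

# Illustrative least-squares data (hypothetical).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
b = A @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200)

poly_decay = lambda k, epoch: 0.1 / np.sqrt(k + 1.0)     # standard polynomial decay, ~1/sqrt(k)
step_decay = lambda k, epoch: 0.1 * 0.5 ** (epoch // 5)  # halve the stepsize every 5 epochs

for name, schedule in [("polynomial decay", poly_decay), ("step decay", step_decay)]:
    x = sgd(schedule, np.zeros(10), (A, b))
    print(name, 0.5 * np.mean((A @ x - b) ** 2))         # final-iterate training loss
```

Step decay keeps the stepsize constant within a stage and cuts it geometrically between stages, whereas the polynomial schedule shrinks it at every iteration.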
Provable non-convex projected gradient descent for a class of constrained matrix optimization problems
TLDR
The Projected Factored Gradient Descent (ProjFGD) algorithm is proposed, which operates on a low-rank factorization of the variable space, and the method is shown to achieve a local linear convergence rate in the non-convex factored space for a class of convex norm-constrained problems.
An accelerated algorithm for delayed distributed convex optimization
TLDR
This thesis provides a framework for distributed delayed convex optimization methods for networks in a master-server setting and proves that a delayed accelerated method maintains the optimality of the algorithm with a convergence rate of O(1/t²).
Accelerated Extra-Gradient Descent: A Novel Accelerated First-Order Method
TLDR
A novel accelerated first-order method is presented that achieves the asymptotically optimal convergence rate for smooth functions in the first-order oracle model and is motivated by discretizing an accelerated continuous-time dynamics with the classical implicit Euler method.
Alternating Randomized Block Coordinate Descent
TLDR
This work introduces a novel algorithm, AR-BCD, whose convergence time scales independently of the least smooth (possibly non-smooth) block, and obtains the first nontrivial accelerated alternating minimization algorithm.
Solving Combinatorial Games using Products, Projections and Lexicographically Optimal Bases
TLDR
A novel primal-style algorithm for computing Bregman projections on the base polytopes of polymatroids and a general recipe to simulate the multiplicative weights update algorithm in time polynomial in their natural dimension are given.
Accelerated Linear Convergence of Stochastic Momentum Methods in Wasserstein Distances
TLDR
This work shows accelerated linear rates in the $p$-Wasserstein metric for any $p \geq 1$, with improved sensitivity to noise for both the accelerated gradient (AG) and heavy-ball (HB) methods, through a non-asymptotic analysis under some additional assumptions on the noise structure.
Multi-stage stochastic gradient method with momentum acceleration
  • Zhijian Luo, Siyu Chen, Yuntao Qian, Yueen Hou
  • Computer Science
  • Signal Process.
  • 2021
TLDR
A multi-stage stochastic gradient descent method with momentum acceleration, named MAGNET, is proposed for first-order stochastic convex optimization; it obtains an accelerated rate of convergence and is adaptive and free from hyper-parameter tuning.

References

Showing 1-10 of 58 references
Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization
TLDR
A new general framework for convex optimization over matrix factorizations, where every Frank-Wolfe iteration consists of a low-rank update, is presented, and the broad application areas of this approach are discussed.
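As a reference point for the projection-free idea, here is a minimal sketch of the Frank-Wolfe (conditional gradient) iteration over an l1-ball, where the linear minimization oracle returns a signed, scaled coordinate vector. The objective and radius are illustrative assumptions; the paper's matrix-factorization setting follows the same template with a nuclear-norm-type oracle instead.

```python
import numpy as np

def frank_wolfe_l1(grad, dim, radius=1.0, iterations=200):
    """Frank-Wolfe (conditional gradient) over the l1-ball of the given radius.

    Each iteration solves the linear minimization oracle min_{||s||_1 <= radius} <grad, s>,
    whose solution is a signed, scaled coordinate vector, then moves by a convex combination.
    """
    x = np.zeros(dim)
    for k in range(iterations):
        g = grad(x)
        i = int(np.argmax(np.abs(g)))
        s = np.zeros(dim)
        s[i] = -radius * np.sign(g[i])     # extreme point of the l1-ball returned by the oracle
        gamma = 2.0 / (k + 2.0)            # standard step-size schedule
        x = (1.0 - gamma) * x + gamma * s  # convex combination keeps x feasible
    return x

# Illustrative smooth objective: least squares with hypothetical data.
rng = np.random.default_rng(2)
A = rng.standard_normal((50, 20))
b = rng.standard_normal(50)
x = frank_wolfe_l1(lambda x: A.T @ (A @ x - b), dim=20, radius=2.0)
print(np.linalg.norm(x, 1))  # never exceeds the radius; iterates are sparse convex combinations
```

The method never projects onto the feasible set; it only calls a linear minimization oracle, which is what makes cheap low-rank updates possible in the matrix case.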
Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)
We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which…
Introductory Lectures on Convex Optimization - A Basic Course
TLDR
It was in the middle of the 1980s that the seminal paper by Karmarkar opened a new epoch in nonlinear optimization, and it became more and more common for new methods to be provided with a complexity analysis, which was considered a better justification of their efficiency than computational experiments.
Parallel coordinate descent methods for big data optimization
In this work we show that randomized (block) coordinate descent methods can be accelerated by parallelization when applied to the problem of minimizing the sum of a partially separable smooth convex…
Lectures on modern convex optimization - analysis, algorithms, and engineering applications
TLDR
The authors present the basic theory of state-of-the-art polynomial-time interior-point methods for linear, conic quadratic, and semidefinite programming, as well as their numerous applications in engineering.
Mirror descent and nonlinear projected subgradient methods for convex optimization
TLDR
It is shown that the mirror descent algorithm (MDA) can be viewed as a nonlinear projected-subgradient-type method, derived from using a general distance-like function instead of the usual squared Euclidean distance, and convergence and efficiency estimates are derived in a simple way.
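Below is a minimal sketch of mirror descent with the entropy distance-generating function (the exponentiated-gradient update) over the probability simplex, illustrating the "general distance-like function instead of the usual squared Euclidean distance". The objective and stepsize are illustrative assumptions.

```python
import numpy as np

def mirror_descent_simplex(grad, dim, stepsize=0.1, iterations=500):
    """Mirror descent over the probability simplex with the (negative) entropy mirror map.

    With this distance-generating function the mirror step becomes a multiplicative
    (exponentiated-gradient) update followed by renormalization onto the simplex.
    """
    x = np.ones(dim) / dim                  # start at the uniform distribution
    for _ in range(iterations):
        g = grad(x)
        x = x * np.exp(-stepsize * g)       # multiplicative update from the entropy mirror map
        x = x / x.sum()                     # Bregman projection back onto the simplex
    return x

# Illustrative objective: f(x) = 0.5 * ||A x - b||^2 minimized over the simplex (hypothetical data).
rng = np.random.default_rng(3)
A = rng.standard_normal((30, 8))
b = rng.standard_normal(30)
x = mirror_descent_simplex(lambda x: A.T @ (A @ x - b), dim=8)
print(x.sum(), x.min())  # iterate stays a probability vector
```

For problems over the simplex the entropy mirror map improves the dimension dependence from the Euclidean sqrt(d) to sqrt(log d), which is the usual motivation for the non-Euclidean setup.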
Interior-point polynomial algorithms in convex programming
TLDR
This book describes the first unified theory of polynomial-time interior-point methods; several of the new algorithms, e.g., the projective method, have been implemented, tested on real-world problems, and found to be extremely efficient in practice.
A mathematical view of interior-point methods in convex optimization
  • J. Renegar
  • Mathematics, Computer Science
  • MPS-SIAM series on optimization
  • 2001
TLDR
This compact book will take a reader who knows little of interior-point methods to within sight of the research frontier, developing key ideas that were over a decade in the making by numerous interior-point method researchers.
Sublinear Optimization for Machine Learning
TLDR
Lower bounds are given which show the running times of many of the algorithms to be nearly best possible in the unit-cost RAM model, and implementations of these algorithms in the semi-streaming setting are presented, obtaining the first low-pass, polylogarithmic-space, and sublinear-time algorithms achieving an arbitrary approximation factor.
Efficient projections onto the l1-ball for learning in high dimensions
TLDR
Efficient algorithms for projecting a vector onto the l1-ball are described, and variants of stochastic gradient projection methods augmented with these efficient projection procedures are shown to outperform interior-point methods, which are considered state-of-the-art optimization techniques.
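Here is a minimal sketch of the standard sort-based Euclidean projection onto the l1-ball, in the spirit of the projection algorithms this reference describes; the function name and test vector are illustrative.

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {x : ||x||_1 <= radius}, sort-based O(n log n) variant."""
    if np.abs(v).sum() <= radius:
        return v.copy()                                  # already inside the ball
    u = np.sort(np.abs(v))[::-1]                         # magnitudes in decreasing order
    css = np.cumsum(u)
    ks = np.arange(1, v.size + 1)
    rho = int(np.max(ks[u - (css - radius) / ks > 0]))   # support size after projection
    theta = (css[rho - 1] - radius) / rho                # soft-thresholding level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

v = np.array([0.8, -1.5, 0.3, 2.0])
w = project_l1_ball(v, radius=1.0)
print(w, np.abs(w).sum())  # the projection's l1-norm equals the radius
```

The sort-based variant runs in O(n log n); the paper also gives expected linear-time variants based on randomized pivoting.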