• Corpus ID: 119153196

Tail bounds for stochastic approximation

  • Michael P. Friedlander, Gabriel Goh
  • arXiv: Optimization and Control
Stochastic-approximation gradient methods are attractive for large-scale convex optimization because they offer inexpensive iterations. They are especially popular in data-fitting and machine-learning applications where the data arrives in a continuous stream, or where large sums of functions must be minimized. It is known that by appropriately decreasing the variance of the error at each iteration, the expected rate of convergence matches that of the underlying deterministic gradient…

A Proximal Stochastic Gradient Method with Progressive Variance Reduction

This work proposes and analyzes a new proximal stochastic gradient method, which uses a multistage scheme to progressively reduce the variance of the stochastic gradient.

Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values

A novel class of stochastic, adaptive methods for minimizing self-concordant functions which can be expressed as an expected value is proposed, which includes extensions of gradient descent and BFGS.

Extragradient Method with Variance Reduction for Stochastic Variational Inequalities

We propose an extragradient method with stepsizes bounded away from zero for stochastic variational inequalities requiring only pseudomonotonicity. We provide convergence and complexity analysis, a...

A Framework for Analyzing Stochastic Optimization Algorithms Under Dependence

This is the first work that analyzes a fully stochastic BFGS algorithm, which also avoids time-consuming or even impossible line-search steps; the algorithm is proved to converge linearly globally and superlinearly locally.

Parallelizing sparse recovery algorithms: A stochastic approach

  • A. Shah, A. Majumdar
  • Computer Science
    2014 19th International Conference on Digital Signal Processing
  • 2014
This work proposes a novel technique, based on the principles of stochastic gradient descent, for accelerating sparse recovery algorithms on multi-core shared-memory architectures. The technique is as accurate as the sequential version but significantly faster, and the speed-up grows with the size of the problem.

Accelerating low-rank matrix completion on GPUs

  • A. Shah, A. Majumdar
  • Computer Science
    2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI)
  • 2014
This work modifies and parallelizes a well-known matrix completion algorithm so that it can be implemented on a GPU. The speed-up is significant and improves as the size of the dataset increases; there is no change in accuracy between the sequential and the proposed parallel implementation.



Approximation accuracy, gradient methods, and error bound for structured convex optimization

  • P. Tseng
  • Computer Science
    Math. Program.
  • 2010
An error bound for the linear convergence analysis of first-order gradient methods for solving convex optimization problems arising in applications, possibly as approximations of intractable problems.

Hybrid Deterministic-Stochastic Methods for Data Fitting

Rate-of-convergence analysis shows that by controlling the sample size in an incremental gradient algorithm, it is possible to maintain the steady convergence rates of full-gradient methods.

Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization

This work shows that both the basic proximal-gradient method and the accelerated proximal-gradient method achieve the same convergence rate as in the error-free case, provided that the errors decrease at appropriate rates.

A Stochastic Approximation Method

Let M(x) denote the expected value at level x of the response to a certain experiment. M(x) is assumed to be a monotone function of x but is unknown to the experimenter, and it is desired to find the

Convergence Rate of Incremental Subgradient Algorithms

An incremental approach to minimizing a convex function that consists of the sum of a large number of component functions is considered, which has been very successful in solving large differentiable least squares problems, such as those arising in the training of neural networks.

Robust Stochastic Approximation Approach to Stochastic Programming

It is intended to demonstrate that a properly modified SA approach can be competitive and even significantly outperform the SAA method for a certain class of convex stochastic problems.

Sparse Online Learning via Truncated Gradient

This work proposes a general method called truncated gradient to induce sparsity in the weights of online-learning algorithms with convex loss, and finds that substantial sparsity is discoverable for datasets with large numbers of features.

Incremental Gradient Algorithms with Stepsizes Bounded Away from Zero

  • M. Solodov
  • Computer Science
    Comput. Optim. Appl.
  • 1998
The first convergence results of any kind for this computationally important case are derived and it is shown that a certain ε-approximate solution can be obtained and the linear dependence of ε on the stepsize limit is established.

Gradient methods for minimizing composite objective function

In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two convex terms: one is smooth and given by a black-box oracle, and

A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems

A new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically.