
- Peter Richtárik, Martin Takáč
- Math. Program.
- 2014

In this paper we develop a randomized block-coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth block-separable convex function and prove that it obtains an ε-accurate solution with probability at least 1 − ρ in at most O((n/ε) log(1/ρ)) iterations, where n is the number of blocks. This extends recent results of Nesterov… (More)
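The basic iteration behind this line of work is easy to sketch. The snippet below shows randomized coordinate descent on a smooth quadratic, the single-coordinate smooth special case rather than the paper's general block-separable proximal setting; the function name `rcd_quadratic` and the toy problem are illustrative, not from the paper.

```python
import numpy as np

def rcd_quadratic(A, b, iters=5000, seed=0):
    """Randomized coordinate descent for f(x) = 0.5 x'Ax - b'x,
    with A symmetric positive definite. Each step picks one coordinate
    uniformly at random and minimizes f exactly along it."""
    rng = np.random.default_rng(seed)
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.integers(n)
        # Partial derivative along coordinate i; the exact line-search step
        # divides by the coordinate-wise Lipschitz constant A[i, i].
        g = A[i] @ x - b[i]
        x[i] -= g / A[i, i]
    return x

# Usage: a small well-conditioned system whose minimizer is A^{-1} b.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = rcd_quadratic(A, b)
```

Each step touches one coordinate only, which is what makes the per-iteration cost independent of the problem dimension.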

- Michel Journée, Yurii Nesterov, Peter Richtárik, Rodolphe Sepulchre
- Journal of Machine Learning Research
- 2010

In this paper we develop a new approach to sparse principal component analysis (sparse PCA). We propose two single-unit and two block optimization formulations of the sparse PCA problem, aimed at extracting a single sparse dominant principal component of a data matrix, or more components at once, respectively. While the initial formulations involve… (More)
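One common way to read a single-unit sparse-PCA formulation is as a power iteration with a soft-thresholding step. The sketch below follows that pattern under those assumptions; `sparse_pc` and the toy matrix are illustrative and this is not the paper's exact GPower scheme.

```python
import numpy as np

def sparse_pc(A, gamma, iters=200, seed=0):
    """Sketch of a single-unit sparse-PCA power iteration: alternate a
    soft-thresholded projection step with renormalization. gamma controls
    sparsity; gamma = 0 recovers ordinary power iteration, whose fixed
    point is the leading right singular vector of A."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    z = rng.standard_normal(m)
    z /= np.linalg.norm(z)
    for _ in range(iters):
        w = A.T @ z
        x = np.sign(w) * np.maximum(np.abs(w) - gamma, 0.0)  # soft-threshold
        nx = np.linalg.norm(x)
        if nx == 0:          # gamma too large: every loading was zeroed out
            return x
        x /= nx
        z = A @ x
        z /= np.linalg.norm(z)
    return x

# Usage: with gamma = 0 the iteration recovers the dominant direction
# of a diagonal matrix, i.e. (up to sign) the first standard basis vector.
A = np.diag([3.0, 2.0, 0.5])
x = sparse_pc(A, gamma=0.0)
```

Raising `gamma` zeroes out small loadings, trading explained variance for sparsity of the extracted component.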

- Olivier Fercoq, Peter Richtárik
- ArXiv
- 2013

We study the performance of a family of randomized parallel coordinate descent methods for minimizing a sum of nonsmooth, separable convex functions. The problem class includes as special cases L1-regularized L1 regression and the minimization of the exponential loss (the "AdaBoost problem"). We assume the input data defining the loss function is… (More)

- Peter Richtárik, Martin Takáč
- Math. Program.
- 2016

In this work we show that randomized (block) coordinate descent methods can be accelerated by parallelization when applied to the problem of minimizing the sum of a partially separable smooth convex function and a simple separable convex function. The theoretical speedup, as compared to the serial method, and referring to the number of iterations needed to… (More)
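The easy extreme of the partial-separability story is a fully separable objective, where several coordinate updates computed from the same iterate can be applied together with no stepsize damping, giving a speedup linear in the number of parallel updates. The sketch below illustrates only that extreme; `parallel_cd` and the toy quadratic are illustrative names, not the paper's method.

```python
import numpy as np

def parallel_cd(c, b, tau=4, iters=500, seed=0):
    """Parallel coordinate descent on the fully separable quadratic
    f(x) = sum_i (0.5 c_i x_i^2 - b_i x_i). All tau updates in a round
    are computed from the same iterate and applied together; because f
    is separable the coordinates do not interact, so no safety factor
    on the stepsize is needed."""
    rng = np.random.default_rng(seed)
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        S = rng.choice(n, size=tau, replace=False)  # coordinates for this round
        g = c[S] * x[S] - b[S]       # all gradients read from the same x
        x[S] -= g / c[S]             # exact per-coordinate minimization
    return x

# Usage: the minimizer of the separable quadratic is x_i = b_i / c_i.
c = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
b = np.arange(1.0, 9.0)
x = parallel_cd(c, b)
```

For objectives that are only partially separable, concurrent updates do interact, and the paper's analysis quantifies how much the stepsize must be damped as a function of the degree of separability.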

Distributed optimization methods for large-scale machine learning suffer from a communication bottleneck. It is difficult to reduce this bottleneck while still efficiently and accurately aggregating partial work from different machines. In this paper, we present a novel generalization of the recent communication-efficient primal-dual framework (CoCoA) for… (More)

- Olivier Fercoq, Peter Richtárik
- SIAM Journal on Optimization
- 2015

We propose a new stochastic coordinate descent method for minimizing the sum of convex functions each of which depends on a small number of coordinates only. Our method (APPROX) is simultaneously Accelerated, Parallel and PROXimal; this is the first time such a method is proposed. In the special case when the number of processors is equal to the number of… (More)

- Peter Richtárik, Martin Takáč
- ArXiv
- 2013

In this paper we develop and analyze Hydra: HYbriD cooRdinAte descent method for solving loss minimization problems with big data. We initially partition the coordinates (features) and assign each partition to a different node of a cluster. At every iteration, each node picks a random subset of the coordinates from those it owns, independently from the… (More)
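The partition-and-sample pattern described above can be simulated serially on a toy quadratic. In the sketch below, `hydra_sketch` is an illustrative name: updates are applied sequentially rather than in parallel across nodes, so it shows the sampling scheme only, not Hydra's distributed execution or its stepsize theory.

```python
import numpy as np

def hydra_sketch(A, b, nodes=2, tau=1, rounds=2000, seed=0):
    """Toy serial simulation of Hydra-style sampling on
    f(x) = 0.5 x'Ax - b'x: the coordinates are statically partitioned
    across `nodes`, and in each round every node picks `tau` of its own
    coordinates at random and takes an exact coordinate-descent step."""
    rng = np.random.default_rng(seed)
    n = len(b)
    parts = np.array_split(np.arange(n), nodes)  # static feature partition
    x = np.zeros(n)
    for _ in range(rounds):
        for part in parts:
            picked = rng.choice(part, size=min(tau, len(part)), replace=False)
            for i in picked:
                g = A[i] @ x - b[i]       # partial derivative at coordinate i
                x[i] -= g / A[i, i]
    return x

# Usage: a small diagonally dominant SPD system.
A = np.array([[2.0, 0.1, 0.0, 0.1],
              [0.1, 2.0, 0.1, 0.0],
              [0.0, 0.1, 2.0, 0.1],
              [0.1, 0.0, 0.1, 2.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
x = hydra_sketch(A, b)
```

In the real method each node holds only its own partition of the data, and the updates happen concurrently, which is what the paper's stepsize analysis has to account for.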

- Jakub Konecný, Peter Richtárik
- Front. Appl. Math. Stat.
- 2017

In this paper we study the problem of minimizing the average of a large number (n) of smooth convex loss functions. We propose a new method, S2GD (Semi-Stochastic Gradient Descent), which runs for one or several epochs, in each of which a single full gradient and a random number of stochastic gradients are computed, following a geometric law. The total work… (More)
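The epoch structure described in the abstract can be sketched on ridge regression. The variance-reduced inner step and the geometrically distributed epoch length follow the description above; the name `s2gd`, the toy data, and the specific constants are illustrative assumptions, not the paper's tuned parameters.

```python
import numpy as np

def s2gd(A, b, lam=0.25, h=0.1, epochs=100, beta=0.9, seed=0):
    """Sketch of S2GD on ridge regression:
    f(x) = (1/(2n)) sum_i (a_i'x - b_i)^2 + (lam/2) ||x||^2.
    Each epoch computes one full gradient at the anchor point y, then
    takes a geometrically distributed number of variance-reduced
    stochastic steps before moving the anchor."""
    rng = np.random.default_rng(seed)
    n, d = A.shape

    def grad_i(x, i):      # stochastic gradient from a single example
        return (A[i] @ x - b[i]) * A[i] + lam * x

    y = np.zeros(d)
    for _ in range(epochs):
        mu = A.T @ (A @ y - b) / n + lam * y      # one full gradient per epoch
        x = y.copy()
        for _ in range(rng.geometric(1 - beta)):  # epoch length: geometric law
            i = rng.integers(n)
            # Variance-reduced step: cheap stochastic gradient, corrected
            # by the anchor's stochastic gradient and the stored full gradient.
            x -= h * (grad_i(x, i) - grad_i(y, i) + mu)
        y = x
    return y

# Usage: a tiny ridge problem with 4 examples in 2 dimensions.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
b = np.array([1.0, 2.0, 3.0, 0.0])
x = s2gd(A, b)
```

The correction term has zero mean, so each inner step is an unbiased gradient estimate whose variance shrinks as the anchor approaches the optimum.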

We address the issue of using mini-batches in stochastic optimization of SVMs. We show that the same quantity, the spectral norm of the data, controls the parallelization speedup obtained for both primal stochastic subgradient descent (SGD) and stochastic dual coordinate ascent (SDCA) methods, and use it to derive novel variants of mini-batched SDCA. Our… (More)
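The primal mini-batch SGD baseline mentioned above can be sketched in a few lines (Pegasos-style subgradient steps on the SVM objective). The name `minibatch_pegasos` and the toy dataset are illustrative; the paper's contribution concerns the dual (SDCA) variants, which are not shown here.

```python
import numpy as np

def minibatch_pegasos(X, y, lam=0.1, batch=4, iters=2000, seed=0):
    """Mini-batched primal stochastic subgradient descent for the SVM
    objective (lam/2)||w||^2 + mean_i max(0, 1 - y_i x_i'w), with the
    classic 1/(lam*t) decreasing stepsize."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, iters + 1):
        idx = rng.choice(n, size=batch, replace=False)
        margins = y[idx] * (X[idx] @ w)
        viol = margins < 1               # examples violating the margin
        g = lam * w                      # subgradient of the regularizer
        if viol.any():
            # Hinge-loss subgradient, averaged over the mini-batch.
            g -= (y[idx][viol, None] * X[idx][viol]).sum(axis=0) / batch
        w -= g / (lam * t)
    return w

# Usage: a small linearly separable dataset (positive vs negative half-plane).
X = np.array([[2.0, 0.0], [1.0, 1.0], [2.0, 1.0], [1.5, -0.5],
              [-2.0, 0.0], [-1.0, -1.0], [-2.0, -1.0], [-1.5, 0.5]])
y = np.array([1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0])
w = minibatch_pegasos(X, y)
```

The paper's point is that how much such mini-batching helps, for both this primal method and the dual SDCA variants, is governed by the spectral norm of the data.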

- Peter Richtárik
- SIAM Journal on Optimization
- 2011

In this paper we propose two modifications to Nesterov's algorithms for minimizing convex functions in relative scale. The first is based on a bisection technique and leads to improved theoretical iteration complexity and the second is a heuristic for avoiding restarting behavior. The fastest of our algorithms produces a solution within relative error… (More)