Semantic Scholar uses AI to extract papers important to this topic.

Highly Cited

2014

We consider the problem of minimizing the sum of two convex functions: one is the average of a large number of smooth component…

Highly Cited

2014

In this work we introduce a new optimisation method called SAGA in the spirit of SAG, SDCA, MISO and SVRG, a set of recently…

Highly Cited

2014

We derive a second-order ordinary differential equation (ODE), which is the limit of Nesterov's accelerated gradient method. This…

Highly Cited

2013

In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two…

Highly Cited

2012

We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly…

Highly Cited

2011

We consider the problem of optimizing the sum of a smooth convex function and a non-smooth convex function using proximal…
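
The abstract is truncated here, so the following is only a generic illustration of the problem class it names (a smooth convex term plus a non-smooth convex term), not the method analyzed in the paper. It is a minimal sketch that assumes a separable quadratic loss, an l1 penalty, and a fixed step size; the names `soft_threshold`, `proximal_gradient`, `b`, `lam`, and `step` are hypothetical.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |.| applied to a scalar v."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def proximal_gradient(b, lam, step=0.5, iters=200):
    """Minimize sum_i 0.5*(x_i - b_i)^2 + lam*|x_i| with proximal-gradient steps.

    Assumptions (not from the source): the smooth part is a separable
    quadratic whose gradient is 1-Lipschitz, so any step <= 1 is admissible.
    """
    x = [0.0] * len(b)
    for _ in range(iters):
        # Gradient step on the smooth part.
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Proximal step on the non-smooth part (scaled by the step size).
        x = [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, grad)]
    return x


x_star = proximal_gradient(b=[3.0, -0.2, 1.5], lam=1.0)
```

For this toy objective the minimizer is available in closed form as the soft-thresholding of `b` at level `lam`, which gives a simple check on the iteration.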

Highly Cited

2009

We consider the minimization of a smooth loss function regularized by the trace norm of the matrix variable. Such formulation…

Highly Cited

2007

In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two…

Highly Cited

2001

We provide a natural gradient method that represents the steepest descent direction based on the underlying structure of the…

Highly Cited

2000

We consider the gradient method $x_{t+1}=x_t+\gamma_t(s_t+w_t)$, where $s_t$ is a descent direction of a function $f:\mathbb{R}^n\to\mathbb{R}$ and…
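
The iteration in this abstract can be sketched on a one-dimensional problem. The abstract is truncated, so the choices below are assumptions rather than the paper's setting: the descent direction $s_t$ is taken to be the negative gradient, the term $w_t$ is modeled as zero-mean Gaussian noise, and the step size is $\gamma_t = 1/t$. The function name `gradient_method_with_errors` and its parameters are hypothetical.

```python
import random


def gradient_method_with_errors(grad, x0, steps=2000, noise=0.1, seed=0):
    """Iterate x_{t+1} = x_t + gamma_t * (s_t + w_t) on a scalar problem."""
    rng = random.Random(seed)
    x = x0
    for t in range(1, steps + 1):
        gamma_t = 1.0 / t            # assumed diminishing step size
        s_t = -grad(x)               # assumed descent direction: negative gradient
        w_t = rng.gauss(0.0, noise)  # assumed zero-mean stochastic error
        x = x + gamma_t * (s_t + w_t)
    return x


# f(x) = 0.5 * (x - 3)^2 has gradient x - 3 and its minimizer at x = 3.
x_final = gradient_method_with_errors(lambda x: x - 3.0, x0=0.0)
```

With these choices the iterate after the last step equals the minimizer plus the running average of the noise terms, so `x_final` ends up close to 3 even though every individual step is perturbed.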