Jakub Konecný

Learn More
In this paper we study the problem of minimizing the average of a large number (n) of smooth convex loss functions. We propose a new method, S2GD (Semi-Stochastic Gradient Descent), which runs for one or several epochs in each of which a single full gradient and a random number of stochastic gradients is computed, following a geometric law. The total work(More)
We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods. We first show that the convergence rate of these methods can be preserved under a decreasing sequence of errors in the control variate, and use this to derive variants of SVRG that use growing-batch strategies to reduce the number(More)
With the growth of data and necessity for distributed optimization methods, solvers that work well on a single machine must be re-designed to leverage distributed computation. Recent work in this area has been limited by focusing heavily on developing highly specific methods for the distributed environment. These special-purpose methods are often unable to(More)
The purpose of this paper is to describe one-shot-learning gesture recognition systems developed on the ChaLearn Gesture Dataset (ChaLearn). We use RGB and depth images and combine appearance (Histograms of Oriented Gradients) and motion descriptors (Histogram of Optical Flow) for parallel temporal segmentation and recognition. The Quadratic-Chi distance(More)
In this paper, we present two new communication-efficient methods for distributed minimization of an average of functions. The first algorithm is an inexact variant of the DANE algorithm [20] that allows any local algorithm to return an approximate solution to a local subproblem. We show that such a strategy does not affect the theoretical guarantees of(More)
We propose mS2GD: a method incorporating a mini-batching scheme for improving the theoretical complexity and practical performance of semi-stochastic gradient descent (S2GD). We consider the problem of minimizing a strongly convex function represented as the sum of an average of a large number of smooth convex functions, and a simple nonsmooth convex(More)
We propose a mini-batching scheme for improving the theoretical complexity and practical performance of semi-stochastic gradient descent applied to the problem of minimizing a strongly convex composite function represented as the sum of an average of a large number of smooth convex functions, and simple nonsmooth convex function. Our method first performs a(More)
We introduce a new and increasingly relevant setting for distributed optimization in machine learning, where the data defining the optimization are distributed (unevenly) over an extremely large number of nodes, but the goal remains to train a high-quality centralized model. We refer to this setting as Federated Optimization. In this setting, communication(More)
We consider the problem of estimating the arithmetic average of a finite collection of real vectors stored in a distributed fashion across several compute nodes subject to a communication budget constraint. Our analysis does not rely on any statistical assumptions about the source of the vectors. This problem arises as a subproblem in many applications,(More)
Primary intestinal tuberculosis is unusual in European and North American countries today. Its diagnosis is often surprising and differentiation from inflammatory bowel diseases is difficult. The authors present a rare case of severe stercoral peritonitis caused by multiple intestinal perforations in a patient with primary ileocecal tuberculosis. Initial(More)