• Corpus ID: 3290132

Distributed Stochastic Multi-Task Learning with Graph Regularization

@article{Wang2018DistributedSM,
  title={Distributed Stochastic Multi-Task Learning with Graph Regularization},
  author={Weiran Wang and Jialei Wang and Mladen Kolar and Nathan Srebro},
  journal={ArXiv},
  year={2018},
  volume={abs/1802.03830}
}
We propose methods for distributed graph-based multi-task learning that are based on weighted averaging of messages from other machines. Uniform averaging or diminishing stepsize in these methods would yield consensus (single task) learning. We show how simply skewing the averaging weights or controlling the stepsize allows learning different, but related, tasks on the different machines. 
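
A minimal sketch of the weighted-averaging mechanism may help fix ideas. It is not the paper's exact algorithm or analysis, just a NumPy toy (problem sizes, graph, stepsize, and weights are illustrative assumptions): each machine runs stochastic gradient steps on its own linear-regression task and then averages its iterate with its ring-graph neighbors. A self-weight close to 1 keeps the machines on distinct but related models, whereas a uniform self-weight of 1/M would drive them to a single consensus model.

  import numpy as np

  rng = np.random.default_rng(0)
  M, d, n = 4, 10, 200                 # machines/tasks, dimension, samples per task

  # Related but distinct ground-truth weights: a shared component plus small task offsets.
  w_shared = rng.normal(size=d)
  w_true = [w_shared + 0.3 * rng.normal(size=d) for _ in range(M)]

  # Private local data for each machine (noisy linear regression).
  X = [rng.normal(size=(n, d)) for _ in range(M)]
  y = [X[m] @ w_true[m] + 0.1 * rng.normal(size=n) for m in range(M)]

  # Mixing matrix over a ring graph. A self-weight near 1 keeps tasks distinct;
  # self_weight = 1/M (uniform averaging) would instead drive all machines to consensus.
  self_weight = 0.8
  P = np.zeros((M, M))
  for m in range(M):
      P[m, m] = self_weight
      P[m, (m - 1) % M] = P[m, (m + 1) % M] = (1 - self_weight) / 2

  w = [np.zeros(d) for _ in range(M)]
  step = 0.05
  for t in range(1000):
      # Local stochastic gradient step on each machine.
      for m in range(M):
          i = rng.integers(n)
          w[m] = w[m] - step * (X[m][i] @ w[m] - y[m][i]) * X[m][i]
      # Exchange iterates and take the skewed weighted average over the graph.
      w = [sum(P[m, j] * w[j] for j in range(M)) for m in range(M)]

  for m in range(M):
      print(f"machine {m}: distance to its own task = {np.linalg.norm(w[m] - w_true[m]):.3f}")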

Citations

Communication-efficient distributed multi-task learning with matrix sparsity regularization

This work proposes a fast communication-efficient distributed optimization method for solving the problem of multi-task learning with matrix sparsity regularization, and theoretically proves that it enjoys a fast convergence rate for different types of loss functions in the distributed environment.
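
That paper's exact objective is not reproduced here, but a common instance of matrix sparsity regularization in multi-task learning is the ℓ2,1 norm on the task weight matrix, whose proximal operator is a simple row-wise soft-thresholding. The NumPy sketch below (sizes and threshold are illustrative assumptions) shows that operator, the step a proximal-gradient-style solver would apply repeatedly.

  import numpy as np

  def prox_l21(W, tau):
      """Proximal operator of tau * ||W||_{2,1}: row-wise group soft-thresholding.

      Rows index features, columns index tasks; rows whose Euclidean norm falls
      below tau are zeroed out, selecting features jointly across all tasks."""
      norms = np.linalg.norm(W, axis=1, keepdims=True)
      scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
      return scale * W

  # Example: a d x T multi-task weight matrix (d features, T tasks).
  rng = np.random.default_rng(1)
  W = rng.normal(size=(8, 3))
  print(np.linalg.norm(prox_l21(W, tau=1.5), axis=1))  # small-norm rows become exactly zero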

Multitask Learning Over Graphs: An Approach for Distributed, Streaming Machine Learning

MTL is an approach to inductive transfer learning (using what is learned for one problem to assist with another problem), and it helps improve generalization performance relative to learning each task separately by using the domain information contained in the training signals of related tasks as an inductive bias.

Distributed Linear Model Clustering over Networks: A Tree-Based Fused-Lasso ADMM Approach

This work designs a decentralized generalized alternating direction method of multipliers (ADMM) algorithm for solving the objective function in parallel and derives theoretical properties that guarantee both model consistency and algorithm convergence.

Distributed Machine Learning with Sparse Heterogeneous Data

This work proposes a method based on Basis Pursuit Denoising with a total variation penalty, provides finite-sample guarantees for sub-Gaussian design matrices, and numerically investigates the performance of distributed methods based on the distributed alternating direction method of multipliers and hyperspectral unmixing.

Privacy-Preserving Federated Multi-Task Linear Regression: A One-Shot Linear Mixing Approach Inspired By Graph Regularization

This work focuses on the federated multi-task linear regression setting, where each machine possesses its own data for individual tasks and sharing the full local data between machines is prohibited, and proposes a novel fusion framework that only requires a one-shot communication of local estimates.
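
The sketch below is not that paper's estimator, only a generic illustration of the one-shot idea from a graph-regularization viewpoint: each machine fits a purely local least-squares estimate, communicates that single d-dimensional vector once, and the estimates are fused through a linear mixing matrix derived from a task-similarity graph. The graph, sample sizes, noise level, and smoothing strength are all assumptions made up for the example.

  import numpy as np

  rng = np.random.default_rng(2)
  M, d, n = 5, 6, 15                    # machines/tasks, dimension, (scarce) samples per task

  # Related task parameters and private local data that is never shared.
  w0 = rng.normal(size=d)
  w_true = [w0 + 0.1 * rng.normal(size=d) for _ in range(M)]
  X = [rng.normal(size=(n, d)) for _ in range(M)]
  y = [X[m] @ w_true[m] + 1.0 * rng.normal(size=n) for m in range(M)]

  # Step 1: each machine computes a purely local least-squares estimate.
  W_local = np.stack([np.linalg.lstsq(X[m], y[m], rcond=None)[0] for m in range(M)])

  # Step 2: one-shot fusion. Only the local estimates are communicated, then mixed
  # with (I + lam * L)^{-1}, the closed-form solution of a graph-regularized
  # least-squares fit to the stacked local estimates.
  A = np.ones((M, M)) - np.eye(M)       # fully connected task-similarity graph
  L = np.diag(A.sum(axis=1)) - A        # graph Laplacian
  lam = 0.5
  W_fused = np.linalg.solve(np.eye(M) + lam * L, W_local)

  err_local = np.mean([np.linalg.norm(W_local[m] - w_true[m]) for m in range(M)])
  err_fused = np.mean([np.linalg.norm(W_fused[m] - w_true[m]) for m in range(M)])
  # With scarce, noisy local data and closely related tasks, fusing typically helps.
  print(f"avg error: local {err_local:.3f}  vs  one-shot fused {err_fused:.3f}")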

Multitask learning over graphs

The article shows how cooperation steers the network limiting point and how different cooperation rules make it possible to promote different task-relatedness models, and explains how and when cooperation over multitask networks outperforms non-cooperative strategies.

Decentralized Multi-Task Learning Based on Extreme Learning Machines

The ELM-based MTL problem is presented in the centralized setting, and the DMTL-ELM algorithm, a hybrid Jacobian and Gauss-Seidel proximal multi-block alternating direction method of multipliers (ADMM), is proposed to solve it.

Randomized Neural Networks Based Decentralized Multi-Task Learning via Hybrid Multi-Block ADMM

This work proposes the DMTL-RSF algorithm, a hybrid Jacobian and Gauss-Seidel proximal multi-block alternating direction method of multipliers (ADMM), demonstrates the convergence of the presented algorithms, and shows that they can outperform existing MTL methods.

Decentralised Sparse Multi-Task Regression.

A decentralised dual method that exploits a convex-concave formulation of the penalised problem is proposed to fit the models, and its effectiveness is demonstrated in simulations against the group lasso and its variants.

Low Sample and Communication Complexities in Decentralized Learning: A Triple Hybrid Approach

This paper proposes a triple hybrid decentralized stochastic gradient descent (TH-DSGD) algorithm for efficiently solving non-convex network-consensus optimization problems in decentralized learning, and shows that TH-DSGD is stable as the network topology becomes sparse and enjoys better convergence in the large-system regime.

References

Showing 1-10 of 34 references

Distributed Multi-Task Learning with Shared Representation

We study the problem of distributed multi-task learning with shared representation, where each machine aims to learn a separate, but related, task in an unknown shared low-dimensional subspace.
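
A tiny NumPy illustration of the modeling assumption (not of the paper's estimator or guarantees): each task's predictor is a combination of k shared directions, so the stacked task weights form a low-rank matrix and the shared subspace is the span of its top left singular vectors. Dimensions are illustrative.

  import numpy as np

  rng = np.random.default_rng(3)
  d, k, M = 20, 3, 6                             # ambient dimension, shared subspace dimension, tasks

  U = np.linalg.qr(rng.normal(size=(d, k)))[0]   # shared (unknown) orthonormal basis
  V = rng.normal(size=(k, M))                    # task-specific coefficients
  W = U @ V                                      # d x M matrix of all task predictors

  # If the predictors were known exactly, the shared subspace would be the span
  # of the top-k left singular vectors of W.
  U_hat = np.linalg.svd(W, full_matrices=False)[0][:, :k]
  print("projection recovered:", np.allclose(U_hat @ U_hat.T, U @ U.T, atol=1e-6))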

Distributed Multitask Learning

A communication-efficient estimator based on the debiased lasso is presented and it is shown that it is comparable with the optimal centralized method.

Distributed Multi-Task Relationship Learning

This paper proposes a distributed multi-task learning framework that alternately learns predictive models for each task and the relationships between tasks in the parameter-server paradigm, and proposes a communication-efficient primal-dual distributed optimization algorithm that solves the dual problem by carefully designing local subproblems to make it decomposable.

Distributed stochastic optimization and learning

  • O. Shamir, N. Srebro
  • Computer Science
  • 2014 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2014
It is shown how the best known guarantees are obtained by an accelerated mini-batch SGD approach, and the runtime and sample costs of the approach are compared with those of other distributed optimization algorithms.

Adding vs. Averaging in Distributed Primal-Dual Optimization

A novel generalization of the recent communication-efficient primal-dual framework (COCOA) for distributed optimization is presented, which allows for additive combination of local updates to the global parameters at each iteration, whereas previous schemes with convergence guarantees only allow conservative averaging.
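
Not the COCOA algorithm itself, but a toy NumPy contrast of the two aggregation rules: each of K workers computes an update from its own data partition, and the driver either averages the K updates (conservative) or adds them at full strength; in this well-conditioned toy the additive rule converges faster. The single-gradient-step local solver and all problem sizes are assumptions for illustration.

  import numpy as np

  rng = np.random.default_rng(4)
  K, d, n = 4, 5, 400
  X = rng.normal(size=(n, d))
  w_star = rng.normal(size=d)
  y = X @ w_star + 0.05 * rng.normal(size=n)
  parts = np.array_split(np.arange(n), K)        # data partitioned across K workers

  def local_update(w, idx, step=0.001):
      """One local gradient step on a worker's partition (stand-in for a local solver)."""
      Xi, yi = X[idx], y[idx]
      return -step * Xi.T @ (Xi @ w - yi)

  def run(aggregate, rounds=50):
      w = np.zeros(d)
      for _ in range(rounds):
          deltas = [local_update(w, idx) for idx in parts]
          w = w + aggregate(deltas)              # combine the K local updates
      return np.linalg.norm(w - w_star)

  print("distance to w_star, averaging:", run(lambda ds: sum(ds) / len(ds)))
  print("distance to w_star, adding:   ", run(lambda ds: sum(ds)))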

Decentralized Collaborative Learning of Personalized Models over Networks

This paper introduces and analyzes two asynchronous gossip algorithms running in a fully decentralized manner, and aims to smooth pre-trained local models over the network while accounting for the confidence that each agent has in its initial model.

Optimal Distributed Online Prediction Using Mini-Batches

This work presents the distributed mini-batch algorithm, a method for converting many serial gradient-based online prediction algorithms into distributed algorithms that is asymptotically optimal for smooth convex loss functions and stochastic inputs, and proves a regret bound for this method.

Convex multi-task feature learning

It is proved that the method for learning sparse representations shared across multiple tasks is equivalent to solving a convex optimization problem for which there is an iterative algorithm that converges to an optimal solution.

Distributed Subgradient Methods for Multi-Agent Optimization

The authors' convergence rate results explicitly characterize the tradeoff between a desired accuracy of the generated approximate optimal solutions and the number of iterations needed to achieve the accuracy.
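
For contrast with the skewed-averaging sketch near the top of the page, here is a minimal version of the classic consensus recipe this reference develops: every agent mixes its iterate with its neighbors using uniform doubly stochastic weights and takes a gradient step with a diminishing stepsize, so all agents converge to a common minimizer of the sum of the local objectives. The quadratic objectives, ring graph, and stepsize schedule are illustrative assumptions.

  import numpy as np

  rng = np.random.default_rng(5)
  M, d = 4, 3
  targets = [rng.normal(size=d) for _ in range(M)]     # agent i holds f_i(x) = ||x - t_i||^2 / 2

  # Uniform, doubly stochastic mixing weights on a ring graph (not skewed).
  P = np.zeros((M, M))
  for i in range(M):
      P[i, i] = 0.5
      P[i, (i - 1) % M] = P[i, (i + 1) % M] = 0.25

  x = [np.zeros(d) for _ in range(M)]
  for t in range(1, 3001):
      mixed = [sum(P[i, j] * x[j] for j in range(M)) for i in range(M)]
      step = 1.0 / t                                   # diminishing stepsize
      x = [mixed[i] - step * (mixed[i] - targets[i]) for i in range(M)]

  consensus = np.mean(targets, axis=0)                 # minimizer of the summed objectives
  print(f"max distance to the common minimizer: {max(np.linalg.norm(xi - consensus) for xi in x):.4f}")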

Learning Multiple Tasks with Kernel Methods

The experiments show that learning multiple related tasks simultaneously using the proposed approach can significantly outperform standard single-task learning, particularly when there are many related tasks but little data per task.
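
Not the paper's exact formulation, but a small illustration of the kernel route to multi-task learning: build a multi-task kernel as the product of an input kernel and a task-similarity matrix, then fit all tasks jointly with kernel ridge regression. The RBF kernel, the similarity parameter rho, and the regularization strength are assumptions chosen for the example.

  import numpy as np

  rng = np.random.default_rng(6)
  T, n, d = 3, 30, 2                            # tasks, samples per task, input dimension

  # Toy data: related linear targets per task on 2-D inputs.
  X = rng.uniform(-1, 1, size=(T * n, d))
  task = np.repeat(np.arange(T), n)             # task index of every sample
  w = [np.array([1.0, -1.0]) + 0.3 * rng.normal(size=d) for _ in range(T)]
  y = np.array([X[i] @ w[task[i]] + 0.1 * rng.normal() for i in range(T * n)])

  def rbf(A, B, gamma=1.0):
      sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
      return np.exp(-gamma * sq)

  # Multi-task kernel: input kernel times a task-similarity matrix. rho = 0 gives
  # independent single-task learners; rho = 1 pools all tasks into one.
  rho = 0.5
  B_task = (1 - rho) * np.eye(T) + rho * np.ones((T, T))
  K = rbf(X, X) * B_task[np.ix_(task, task)]

  lam = 0.1
  alpha = np.linalg.solve(K + lam * np.eye(T * n), y)   # kernel ridge regression, all tasks jointly

  # Predict for a new point assigned to task 0.
  x_new, t_new = np.array([[0.2, -0.4]]), 0
  k_new = rbf(x_new, X) * B_task[t_new, task]           # cross-kernel row for (x_new, task 0)
  print("prediction for task 0 at x_new:", (k_new @ alpha).item())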