• Corpus ID: 239998731

Conflict-Averse Gradient Descent for Multi-task Learning

  • Bo Liu, Xingchao Liu, Xiaojie Jin, Peter Stone, Qiang Liu
The goal of multi-task learning is to enable more efficient learning than single-task learning by sharing model structures across a diverse set of tasks. A standard multi-task learning objective is to minimize the average loss across all tasks. While straightforward, this objective often yields much worse final performance on each task than learning the tasks independently. A major challenge in optimizing a multi-task model is conflicting gradients, where gradients of different task…
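The gradient conflict described in the abstract can be checked directly: two task gradients conflict when their dot product (equivalently, their cosine similarity) is negative, so following the average direction hurts one of the tasks. A minimal sketch in plain NumPy (the function name is my own, not from the paper):

```python
import numpy as np

def gradients_conflict(g1, g2):
    """Two task gradients conflict when cos(g1, g2) < 0,
    i.e. when their dot product is negative."""
    return float(np.dot(g1, g2)) < 0.0

# Aligned gradients: the average direction helps both tasks.
print(gradients_conflict(np.array([1.0, 0.0]), np.array([1.0, 0.5])))   # False
# Conflicting gradients: the average direction hurts one task.
print(gradients_conflict(np.array([1.0, 0.0]), np.array([-1.0, 0.5])))  # True
```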


Multi-Task Learning as Multi-Objective Optimization
This paper proposes an upper bound for the multi-objective loss and shows that it can be optimized efficiently, and proves that optimizing this upper bound yields a Pareto optimal solution under realistic assumptions.
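In the two-task case, the minimum-norm convex combination of the task gradients that underlies this multi-objective view has a closed form: minimize ||α g1 + (1−α) g2||² over α ∈ [0, 1]. A NumPy sketch under that assumption (the helper name is mine):

```python
import numpy as np

def min_norm_combination(g1, g2):
    """Closed-form minimum-norm point in the convex hull of two task
    gradients: alpha = (g2 - g1).g2 / ||g1 - g2||^2, clipped to [0, 1]."""
    diff = g1 - g2
    denom = float(np.dot(diff, diff))
    if denom == 0.0:  # identical gradients: any combination works
        return 1.0, g1.copy()
    alpha = float(np.dot(g2 - g1, g2)) / denom
    alpha = min(max(alpha, 0.0), 1.0)
    return alpha, alpha * g1 + (1.0 - alpha) * g2

# Directly opposed gradients of different magnitude: the min-norm point
# is the zero vector, i.e. the point is Pareto-stationary.
alpha, g = min_norm_combination(np.array([2.0, 0.0]), np.array([-1.0, 0.0]))
print(alpha, g)
```

When the zero vector lies in the convex hull of the gradients, as here, no common descent direction exists and the combination collapses to zero.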
Multi-Task Learning with User Preferences: Gradient Descent with Controlled Ascent in Pareto Optimization
This work develops the first gradient-based multi-objective MTL algorithm that combines multiple gradient descent with carefully controlled ascent to traverse the Pareto front in a principled manner, which also makes it robust to initialization.
Distral: Robust multitask reinforcement learning
This work proposes a new approach for joint training of multiple tasks, which it refers to as Distral (Distill & transfer learning), and shows that the proposed learning process is more robust and more stable, attributes that are critical in deep reinforcement learning.
Towards Impartial Multi-task Learning
This paper proposes impartial multi-task learning (IMTL), which can be trained end-to-end without any heuristic hyper-parameter tuning and is general enough to be applied to all kinds of losses without any distributional assumption.
Dynamic Task Prioritization for Multitask Learning
This work proposes a notion of dynamic task prioritization that automatically prioritizes more difficult tasks by adaptively adjusting the mixing weight of each task's loss objective; the method outperforms existing multi-task approaches and achieves competitive results against modern single-task models on the COCO and MPII datasets.
Attentive Single-Tasking of Multiple Tasks
In this work we address task interference in universal networks by training a network on multiple tasks while performing one task at a time, an approach referred to as attentive single-tasking.
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks
A gradient normalization (GradNorm) algorithm that automatically balances training in deep multitask models by dynamically tuning gradient magnitudes is presented, showing that for various network architectures, for both regression and classification tasks, and on both synthetic and real datasets, GradNorm improves accuracy and reduces overfitting across multiple tasks.
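The magnitude-balancing idea can be illustrated with a much-simplified one-step sketch: rescale each task's loss weight so the weighted gradient norms move toward a common target (here the mean norm). This is not the full GradNorm update, which also incorporates each task's relative training rate; names below are my own:

```python
import numpy as np

def balance_weights(weights, grad_norms):
    """One simplified balancing step: after this step, w_i * ||g_i|| is
    equal across tasks (shared target = mean of the raw gradient norms)."""
    grad_norms = np.asarray(grad_norms, dtype=float)
    target = grad_norms.mean()
    new_w = np.asarray(weights, dtype=float) * (target / grad_norms)
    # Renormalize so the weights sum to the number of tasks, as GradNorm does.
    return new_w * len(new_w) / new_w.sum()

# The task with the 4x-larger gradient gets its weight scaled down.
print(balance_weights([1.0, 1.0], [4.0, 1.0]))
```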
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
This paper proposes soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum-entropy reinforcement learning framework, which achieves state-of-the-art performance on a range of continuous control benchmark tasks, outperforming prior on-policy and off-policy methods.
A Survey on Multi-Task Learning
  • Yu Zhang, Qiang Yang
  • 2017
A survey of MTL is given, which classifies MTL algorithms into several categories, including the feature learning, low-rank, task clustering, task relation learning, and decomposition approaches, and then discusses the characteristics of each approach.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
This work defines a novel method of multi-task and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously and then generalize its knowledge to new domains, using Atari games as a testing environment to demonstrate these methods.