LibMTL: A Python Library for Multi-Task Learning

Baijiong Lin and Yu Zhang
This paper presents LibMTL, an open-source Python library built on PyTorch, which provides a unified, comprehensive, reproducible, and extensible implementation framework for Multi-Task Learning (MTL). LibMTL considers different settings and approaches in MTL, and it supports a large number of state-of-the-art MTL methods, including 12 loss weighting strategies, 7 architectures, and 84 combinations of different architectures and loss weighting methods. Moreover, the modular design in LibMTL…
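As a quick sanity check on the counts in the abstract, the 84 supported combinations follow directly from pairing every architecture with every loss weighting strategy. The identifiers below are placeholders, not LibMTL's actual class names; only the counts come from the abstract:

```python
from itertools import product

# Placeholder identifiers -- the real method names live in LibMTL's
# weighting and architecture modules; here we only need the counts.
weightings = [f"weighting_{i}" for i in range(12)]   # 12 loss weighting strategies
architectures = [f"arch_{j}" for j in range(7)]      # 7 architectures

combos = list(product(architectures, weightings))
print(len(combos))  # 7 * 12 = 84 combinations
```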


Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning

Multi-Task Learning (MTL) has achieved success in various fields. However, training with equal weights for all tasks may cause unsatisfactory performance on some tasks. To address this problem, …

Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Stochastic Approach

A stochastic Multi-objective gradient Correction (MoCo) method for multi-objective optimization that guarantees convergence without increasing the batch size, even in the nonconvex setting, and outperforms state-of-the-art MTL algorithms on challenging MTL benchmarks.



MTLV: a library for building deep multi-task learning architectures

This work investigates the performance of BERT-family models in different MTL settings with the Open-I (radiology reports) and OHSUMED (PubMed abstracts) datasets, and introduces the MTLV (Multi-Task Learning Visualizer) library for building multi-task learning architectures that reuse existing infrastructure.

Towards Impartial Multi-task Learning

This paper proposes an impartial multi-task learning (IMTL) that can be end-to-end trained without any heuristic hyper-parameter tuning, and is general to be applied on all kinds of losses without any distribution assumption.
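To make the "impartial" idea concrete, the sketch below shows the loss-balancing flavor of IMTL in its simplest form: each loss L_i is rescaled as exp(s_i) * L_i - s_i, and the optimal scale s_i = -log(L_i) makes every rescaled loss contribute equally. This is a simplified, closed-form illustration, not the paper's full (gradient-level, end-to-end trained) method:

```python
import math

def imtl_l_scales(losses):
    """Loss-balancing sketch in the spirit of IMTL-L: with the rescaled loss
    exp(s) * L - s, minimizing over s gives s = -log(L), so every rescaled
    loss becomes equal regardless of the raw loss magnitudes."""
    return [-math.log(l) for l in losses]

losses = [0.5, 2.0, 8.0]                 # toy per-task losses on very different scales
scales = imtl_l_scales(losses)
balanced = [math.exp(s) * l for s, l in zip(scales, losses)]
print([round(b, 9) for b in balanced])   # every task now contributes equally
```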

End-To-End Multi-Task Learning With Attention

The proposed Multi-Task Attention Network (MTAN) consists of a single shared network containing a global feature pool, together with a soft-attention module for each task, which allows learning of task-specific feature-level attention.

RMTL: an R library for multi-task learning

An efficient, easy-to-use R library for MTL comprising 10 algorithms applicable to regression, classification, joint predictor selection, task clustering, low-rank learning, and the incorporation of biological networks is developed.

PyTorch: An Imperative Style, High-Performance Deep Learning Library

This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.

A Survey on Multi-Task Learning

A survey of MTL from the perspective of algorithmic modeling, applications, and theoretical analyses, which gives a definition of MTL and classifies MTL algorithms into five categories: the feature learning, low-rank, task clustering, task relation learning, and decomposition approaches.

A Closer Look at Loss Weighting in Multi-Task Learning

It is surprisingly found that training an MTL model with random weights sampled from a distribution achieves performance comparable to state-of-the-art baselines; the resulting method, Random Loss Weighting (RLW), can be implemented in only one additional line of code over existing works.
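The RLW idea described above is simple enough to sketch in a few lines: at each training step, sample per-task weights from a distribution (a standard normal here, softmax-normalized) and combine the task losses with them. The function and variable names are illustrative, not from the paper's code:

```python
import math
import random

def rlw_weights(num_tasks, rng):
    """Sample per-task weights from a standard normal and softmax-normalize,
    following the Random Loss Weighting (RLW) idea; a new sample is drawn
    at every training step."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(num_tasks)]
    m = max(raw)                               # subtract max for numerical stability
    exps = [math.exp(r - m) for r in raw]
    s = sum(exps)
    return [e / s for e in exps]

rng = random.Random(0)
losses = [0.9, 1.4, 0.6]                       # toy per-task losses
w = rlw_weights(len(losses), rng)
total = sum(wi * li for wi, li in zip(w, losses))
print(round(sum(w), 6))                        # weights sum to 1.0
```

Because the weights are a softmax, the combined loss is always a convex combination of the task losses, so it stays between the smallest and largest per-task loss.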

Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations

A Progressive Layered Extraction model with a novel sharing-structure design is proposed; it significantly outperforms state-of-the-art MTL models under different task correlations and task-group sizes and has been successfully deployed to Tencent's online video recommender system.

DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning

DSelect-k is developed: the first continuously differentiable and sparse gate for MoE, based on a novel binary encoding formulation, that can be trained using first-order methods such as stochastic gradient descent and offers explicit control over the number of experts to select.
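For contrast with DSelect-k's differentiable formulation, the baseline it improves on is the plain hard top-k gate below: select the k largest logits, softmax over only those, and zero out the rest. This is explicitly *not* DSelect-k (whose binary-encoded gate stays continuously differentiable); it just shows what "explicit control over the number of experts" means:

```python
import math

def topk_softmax_gate(logits, k):
    """A plain (non-differentiable) top-k gate for mixture-of-experts:
    exactly k experts get nonzero weight. DSelect-k replaces this hard
    selection with a continuously differentiable, binary-encoded gate."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = {i: math.exp(logits[i]) for i in top}
    z = sum(exps.values())
    return [exps[i] / z if i in exps else 0.0 for i in range(len(logits))]

weights = topk_softmax_gate([0.1, 2.0, -1.0, 1.5], k=2)
print(sum(1 for x in weights if x > 0))   # exactly k = 2 experts selected
```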

GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks

A gradient normalization (GradNorm) algorithm that automatically balances training in deep multitask models by dynamically tuning gradient magnitudes is presented, showing that for various network architectures, for both regression and classification tasks, and on both synthetic and real datasets, GradNorm improves accuracy and reduces overfitting across multiple tasks.
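The balancing direction GradNorm takes can be illustrated with a deliberately simplified toy update. The real algorithm backpropagates through the per-task gradient norms; the sketch below (my simplification: scalar gradient norms, a sign-based step instead of the true gradient, and loss ratios as the inverse-training-rate signal) only shows that tasks with oversized gradients get down-weighted:

```python
def gradnorm_step(weights, grad_norms, loss_ratios, alpha=1.5, lr=0.025):
    """One simplified GradNorm-style update (a sketch, not the paper's exact
    algorithm): nudge each task weight so its weighted gradient norm moves
    toward a shared target, amplified for tasks that are training slowly."""
    n = len(weights)
    mean_norm = sum(w * g for w, g in zip(weights, grad_norms)) / n
    mean_ratio = sum(loss_ratios) / n
    targets = [mean_norm * (r / mean_ratio) ** alpha for r in loss_ratios]
    # derivative of |w*g - target| w.r.t. w is g * sign(w*g - target)
    new = [w - lr * g * (1.0 if w * g > t else -1.0)
           for w, g, t in zip(weights, grad_norms, targets)]
    s = sum(new)
    return [n * x / s for x in new]   # renormalize: weights sum to n, as in GradNorm

w = gradnorm_step([1.0, 1.0], grad_norms=[2.0, 0.5], loss_ratios=[1.0, 1.0])
print(w[0] < w[1])  # the task with the larger gradient norm is down-weighted
```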