AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
@article{Sun2019AdaShareLW,
  title   = {AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning},
  author  = {Ximeng Sun and Rameswar Panda and Rog{\'e}rio Schmidt Feris},
  journal = {ArXiv},
  year    = {2019},
  volume  = {abs/1911.12423}
}
Multi-task learning is an open and challenging problem in computer vision. The typical way of conducting multi-task learning with deep neural networks is either through handcrafted schemes that share all initial layers and branch out at an ad-hoc point, or through separate task-specific networks with an additional feature sharing/fusion mechanism. Unlike existing methods, we propose an adaptive sharing approach, called AdaShare, that decides what to share across which tasks for achieving…
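The core idea, a learned per-task select-or-skip policy over the backbone's blocks, trained with a Gumbel-Softmax relaxation, can be sketched in plain Python. The logits, task names, and 0.5 threshold below are illustrative assumptions, not the paper's exact parameterization:

```python
import math
import random

def relaxed_select(logit, tau=1.0):
    """Relaxed Bernoulli 'select-or-skip' sample via Gumbel-Softmax.
    Returns the probability mass on 'select' for one residual block."""
    g_sel = -math.log(-math.log(random.random()))   # Gumbel noise for 'select'
    g_skip = -math.log(-math.log(random.random()))  # Gumbel noise for 'skip'
    z_sel = math.exp((logit + g_sel) / tau)
    z_skip = math.exp(g_skip / tau)
    return z_sel / (z_sel + z_skip)

# One learnable logit per (task, block); at test time a block is kept
# for a task when the relaxed sample exceeds 0.5 (a hard decision).
random.seed(0)
logits = {"seg": [2.0, -1.5, 0.5], "depth": [2.0, 1.0, -2.0]}
policy = {t: [relaxed_select(l) > 0.5 for l in ls] for t, ls in logits.items()}
```

During training the soft samples keep the policy differentiable; the hard thresholding is only applied at inference.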
100 Citations
Multi-Task Learning with Deep Neural Networks: A Survey
- Computer Science · ArXiv
- 2020
An overview of multi-task learning methods for deep neural networks is given, with the aim of summarizing both the well-established and most recent directions within the field.
Learning to Branch for Multi-Task Learning
- Computer Science · ICML
- 2020
This work proposes a novel tree-structured design space that casts the tree branching operation as a Gumbel-Softmax sampling procedure, enabling differentiable network splitting that is end-to-end trainable.
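The branching-as-sampling idea can be sketched as follows; this is a hypothetical simplification in plain Python, not the paper's implementation. Each child node holds logits over candidate parent nodes and draws relaxed one-hot weights:

```python
import math
import random

def sample_branch_weights(logits, tau=1.0):
    """Relaxed one-hot choice among candidate parent nodes via
    Gumbel-Softmax; the argmax of the weights picks the branch."""
    gumbels = [-math.log(-math.log(random.random())) for _ in logits]
    z = [math.exp((l + g) / tau) for l, g in zip(logits, gumbels)]
    total = sum(z)
    return [zi / total for zi in z]  # soft weights summing to 1
```

Because the weights are a smooth function of the logits, the branching structure can be optimized jointly with the network weights by gradient descent.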
Shapeshifter Networks: Cross-layer Parameter Sharing for Scalable and Effective Deep Learning
- Computer Science · ArXiv
- 2020
SSNs address the observation that many neural networks are severely overparameterized, which wastes computational resources and invites overfitting; they learn where and how to share parameters between layers in a neural network while avoiding degenerate solutions that result in underfitting.
Multi-path Neural Networks for On-device Multi-domain Visual Classification
- Computer Science · 2021 IEEE Winter Conference on Applications of Computer Vision (WACV)
- 2021
This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space.
Which Tasks Should Be Learned Together in Multi-task Learning?
- Computer Science · ICML
- 2020
This framework offers a time-accuracy trade-off and can produce better accuracy using less inference time than not only a single large multi-task neural network but also many single-task networks.
Measuring and Harnessing Transference in Multi-Task Learning
- Computer Science · ArXiv
- 2020
This paper develops a similarity measure that can quantify transference among tasks and uses this quantity to both better understand the optimization dynamics of multi-task learning as well as improve overall learning performance.
Latent Domain Learning with Dynamic Residual Adapters
- Computer Science · ArXiv
- 2020
This work proposes dynamic residual adapters, an adaptive gating mechanism that accounts for latent domains, coupled with an augmentation strategy inspired by recent style-transfer techniques; the resulting models significantly outperform off-the-shelf networks with much larger capacity and can be incorporated seamlessly into existing architectures in an end-to-end manner.
Safe Multi-Task Learning
- Computer Science · ArXiv
- 2021
A Safe Multi-Task Learning (SMTL) model is proposed, consisting of a public encoder shared by all the tasks together with private encoders, gates, and private decoders; a lite version of SMTL is also proposed to reduce the storage cost during the inference stage.
Multi-Task Meta Learning: learn how to adapt to unseen tasks
- Computer Science · ArXiv
- 2022
This work proposes Multi-Task Meta Learning, integrating two learning paradigms, Multi-Task Learning (MTL) and meta learning, to bring together the best of both worlds; it achieves state-of-the-art results on most of the tasks.
Toward Edge-Efficient Dense Predictions with Synergistic Multi-Task Neural Architecture Search
- Computer Science · ArXiv
- 2022
This framework, named EDNAS, is the first to successfully leverage the synergistic relationship of NAS and MTL for dense prediction (DP); the paper also proposes JAReD, an improved, easy-to-adopt Joint Absolute-Relative Depth loss that reduces up to 88% of the undesired noise while simultaneously boosting accuracy.
References
Showing 1–10 of 74 references
End-To-End Multi-Task Learning With Attention
- Computer Science · 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
The proposed Multi-Task Attention Network (MTAN) consists of a single shared network containing a global feature pool, together with a soft-attention module for each task, which allows learning of task-specific feature-level attention.
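A rough sketch of the per-task attention idea, assuming a simple element-wise sigmoid gate over a shared feature vector (the real MTAN module operates on convolutional feature maps, which is omitted here):

```python
import math

def task_attention(shared_feats, mask_logits):
    """Apply a per-task soft attention mask (sigmoid gate) to features
    drawn from the shared global feature pool. Gate values lie in (0, 1),
    so each task softly selects which shared features to use."""
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mask_logits]
    return [f * g for f, g in zip(shared_feats, gates)]
```

Each task learns its own `mask_logits`, so tasks can attend to different subsets of the shared representation without duplicating the backbone.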
Many Task Learning With Task Routing
- Computer Science · 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This paper introduces Many Task Learning (MaTL) as a special case of MTL where more than 20 tasks are performed by a single model and applies a conditional feature-wise transformation over the convolutional activations that enables a model to successfully perform a large number of tasks.
Branched Multi-Task Networks: Deciding what layers to share
- Computer Science · BMVC
- 2020
This paper proposes an approach to automatically construct branched multi-task networks, by leveraging the employed tasks' affinities, given a specific budget, and generates architectures, in which shallow layers are task-agnostic, whereas deeper ones gradually grow more task-specific.
Deep Elastic Networks With Model Selection for Multi-Task Learning
- Computer Science · 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work proposes an efficient approach that exploits a compact but accurate model within a backbone architecture for each instance of all tasks, performing instance-wise dynamic network model selection for multi-task learning.
SpotTune: Transfer Learning Through Adaptive Fine-Tuning
- Computer Science · 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
In SpotTune, given an image from the target task, a policy network makes routing decisions on whether to pass the image through the fine-tuned layers or the pre-trained layers; the approach outperforms the traditional fine-tuning approach on 12 out of 14 standard datasets.
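The routing scheme can be illustrated with toy stand-in blocks (hypothetical; SpotTune's policy network and real residual blocks are omitted, and the route would normally come from a Gumbel-Softmax-sampled policy):

```python
def spottune_forward(x, pretrained, finetuned, route):
    """Route the input through either the frozen pre-trained block or
    its fine-tuned copy at every depth, per the policy decisions."""
    for pt, ft, use_ft in zip(pretrained, finetuned, route):
        x = ft(x) if use_ft else pt(x)
    return x

# Toy blocks standing in for residual blocks (illustrative only).
pretrained = [lambda v: v + 1, lambda v: v * 2]
finetuned = [lambda v: v + 10, lambda v: v * 3]
```

Because the two block sets share a depth-wise pairing, the policy only has one binary decision per layer, keeping the routing space small.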
Cross-Stitch Networks for Multi-task Learning
- Computer Science · 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
This paper proposes a principled approach to learning shared representations in Convolutional Networks for multi-task learning, using a new sharing unit: the "cross-stitch" unit, which combines the activations from multiple networks and can be trained end-to-end.
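For two tasks, a cross-stitch unit is a learned 2x2 linear mix of the two networks' activations at the same location; a minimal sketch on flat feature vectors:

```python
def cross_stitch(x_a, x_b, alpha):
    """2x2 cross-stitch unit: each output activation is a learned
    linear combination of the two task networks' activations.
    alpha[i][j] weights task j's input in task i's output."""
    y_a = [alpha[0][0] * a + alpha[0][1] * b for a, b in zip(x_a, x_b)]
    y_b = [alpha[1][0] * a + alpha[1][1] * b for a, b in zip(x_a, x_b)]
    return y_a, y_b
```

With `alpha` near the identity the networks stay task-specific; off-diagonal weights learned during training control how much the tasks share.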
Stochastic Filter Groups for Multi-Task CNNs: Learning Specialist and Generalist Convolution Kernels
- Computer Science · 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This paper proposes "stochastic filter groups" (SFG), a mechanism to assign convolution kernels in each layer to "specialist" and "generalist" groups, which are specific to and shared across different tasks, respectively.
Fully-Adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification
- Computer Science · 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
Evaluation on person attribute classification tasks involving facial and clothing attributes suggests that the models produced by the proposed method are fast, compact, and can closely match or exceed the state-of-the-art accuracy of strong baselines built from much more expensive models.
Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
- Computer Science · 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2018
A principled approach to multi-task deep learning is proposed which weighs multiple loss functions by considering the homoscedastic uncertainty of each task, allowing us to simultaneously learn various quantities with different units or scales in both classification and regression settings.
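A common simplified implementation of this weighting learns a log-variance s_i per task and computes exp(-s_i)·L_i + s_i; this is a sketch of that simplification, not the paper's exact per-likelihood regression and classification terms:

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Homoscedastic-uncertainty weighting: total = sum_i exp(-s_i)*L_i + s_i,
    where s_i = log(sigma_i^2) is a learned per-task log-variance.
    A high-variance (noisy) task is automatically down-weighted, while
    the +s_i term penalizes inflating the variance without limit."""
    return sum(math.exp(-s) * loss + s
               for loss, s in zip(task_losses, log_vars))
```

In practice the `log_vars` are trainable parameters optimized jointly with the network weights, which removes the need for manual per-task loss coefficients.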
Integrated perception with recurrent multi-task neural networks
- Computer Science · NIPS
- 2016
This work proposes a new architecture, which it calls "MultiNet", in which not only deep image features are shared between tasks, but where tasks can interact in a recurrent manner by encoding the results of their analysis in a common shared representation of the data.