Corpus ID: 195346143

Piggyback: Adding Multiple Tasks to a Single, Fixed Network by Learning to Mask

@article{Mallya2018PiggybackAM,
  title={Piggyback: Adding Multiple Tasks to a Single, Fixed Network by Learning to Mask},
  author={Arun Mallya and Svetlana Lazebnik},
  journal={ArXiv},
  year={2018},
  volume={abs/1801.06519}
}
This work presents a method for adding multiple tasks to a single, fixed deep neural network without affecting performance on already learned tasks. By building upon concepts from network quantization and sparsification, we learn binary masks that "piggyback" on, or are applied to, an existing network to provide good performance on a new task. These masks are learned in an end-to-end differentiable fashion, and incur a low overhead of 1 bit per network parameter, per task. Even though the…
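To make the mechanism concrete, the following is a minimal PyTorch sketch of the kind of masking the abstract describes: a real-valued score is learned for every weight of a frozen backbone layer, thresholded into a binary mask, and applied elementwise, with gradients passed straight through to the scores. The module names, threshold value, and initialization below are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    """Threshold real-valued scores to {0, 1}; pass gradients straight through."""
    @staticmethod
    def forward(ctx, scores, threshold):
        return (scores >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: the gradient flows to the real-valued scores.
        return grad_output, None

class MaskedConv2d(nn.Module):
    """Conv layer whose pretrained weight stays fixed; only the mask scores are trained."""
    def __init__(self, pretrained_conv: nn.Conv2d, threshold: float = 5e-3):
        super().__init__()
        self.weight = nn.Parameter(pretrained_conv.weight.data.clone(),
                                   requires_grad=False)   # frozen backbone weight
        self.bias = (nn.Parameter(pretrained_conv.bias.data.clone(), requires_grad=False)
                     if pretrained_conv.bias is not None else None)
        self.stride, self.padding = pretrained_conv.stride, pretrained_conv.padding
        # One real-valued score per weight, initialized above the threshold so
        # training starts from the unmasked pretrained behaviour.
        self.mask_scores = nn.Parameter(torch.full_like(self.weight, 1e-2))
        self.threshold = threshold

    def forward(self, x):
        mask = BinarizeSTE.apply(self.mask_scores, self.threshold)  # binary, 1 bit/param
        return F.conv2d(x, self.weight * mask, self.bias,
                        stride=self.stride, padding=self.padding)

At deployment, only the thresholded mask (1 bit per parameter) and any task-specific classifier head need to be stored for each added task; the backbone weights are shared by all tasks.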
Adding New Tasks to a Single Network with Weight Transformations using Binary Masks
This work shows that a generalization of this approach achieves significantly higher levels of adaptation to new tasks, enabling it to compete with fine-tuning strategies while requiring slightly more than 1 bit per network parameter per additional task.
Side-Tuning: Network Adaptation via Additive Side Networks
Side-tuning adapts a pre-trained network by training a lightweight "side" network that is fused with the (unchanged) pre-trained network using summation; it works as well as or better than existing solutions while resolving some of the basic issues with fine-tuning, fixed features, and several other common baselines.
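As a rough illustration of the fusion described in this summary, the sketch below blends a frozen base network with a small trainable side network by summation; the learned blending parameter alpha and the module names are assumptions for illustration, and the two networks are assumed to produce outputs of the same shape.

import torch
import torch.nn as nn

class SideTuned(nn.Module):
    def __init__(self, base: nn.Module, side: nn.Module):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                # the pre-trained network stays unchanged
        self.side = side                           # lightweight, trainable side network
        self.alpha = nn.Parameter(torch.zeros(1))  # learned blending weight

    def forward(self, x):
        a = torch.sigmoid(self.alpha)
        # Fuse the frozen base output with the side network's output by summation.
        return a * self.base(x) + (1 - a) * self.side(x)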
Boosting binary masks for multi-domain learning through affine transformations
This work provides a general formulation of binary mask-based models for multi-domain learning through affine transformations of the original network parameters, addressing the challenge of producing a single model that performs well in all the domains together.
Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting
By separating explicit neural structure learning from parameter estimation, the proposed method is not only capable of evolving neural structures in an intuitively meaningful way, but also shows strong capabilities for alleviating catastrophic forgetting in experiments.
SpotTune: Transfer Learning Through Adaptive Fine-Tuning
In SpotTune, given an image from the target task, a policy network makes routing decisions on whether to pass the image through the fine-tuned layers or the pre-trained layers; the approach outperforms the traditional fine-tuning approach on 12 out of 14 standard datasets.
Reducing catastrophic forgetting with learning on synthetic data
This work addresses the question of whether it is possible to synthetically generate data that, when learned in sequence, does not result in catastrophic forgetting, and proposes a method to generate such data via a two-step optimisation process using meta-gradients.
Towards Meta-learning of Deep Architectures for Efficient Domain Adaptation
An efficient domain adaptation approach using deep learning together with transfer and meta-level learning is proposed, empirically confirming the intuition that there is a relationship between the similarity of the original and new tasks and the depth of network that needs to be fine-tuned to achieve accuracy comparable with that of a model trained from scratch.
Which Tasks Should Be Learned Together in Multi-task Learning?
This framework offers a time-accuracy trade-off and can produce better accuracy using less inference time than not only a single large multi-task neural network but also many single-task networks.
Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning
DGM relies on conditional generative adversarial networks with learnable connection plasticity realized through neural masking, and a dynamic network expansion mechanism is proposed that ensures sufficient model capacity to accommodate continually incoming tasks.
Move-to-Data: A new Continual Learning approach with Deep CNNs, Application for image-class recognition
This work focuses on the problem of adjusting a pre-trained model with new additional training data for existing categories, and proposes a fast continual learning layer at the end of the neural network.

References

Showing 1-10 of 43 references
PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning
  • Arun Mallya, S. Lazebnik
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
This paper adds three fine-grained classification tasks to a single ImageNet-trained VGG-16 network and achieves accuracies close to those of separately trained networks for each task.
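The pruning step this summary alludes to can be sketched as magnitude-based pruning over the weights not yet claimed by earlier tasks: after training a task, the lowest-magnitude fraction of the currently free weights is released for future tasks and the rest are frozen. The helper below is a hedged PyTorch illustration; the function name and the pruning fraction are assumptions, not the paper's code.

import torch

def prune_for_next_task(weight: torch.Tensor,
                        free_mask: torch.Tensor,
                        prune_frac: float = 0.5):
    """Split the currently free weights into (kept_mask, released_mask) by magnitude."""
    free = free_mask.bool()
    free_vals = weight[free].abs()
    if free_vals.numel() == 0:
        return torch.zeros_like(free_mask), free_mask
    cutoff = torch.quantile(free_vals, prune_frac)
    released = free & (weight.abs() <= cutoff)   # zeroed out, available to later tasks
    kept = free & ~released                      # frozen as this task's weights
    return kept.float(), released.float()

# Usage sketch: zero the released weights, then briefly retrain on the current task
# so that accuracy recovers using only the kept weights.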
Incremental Learning Through Deep Adaptation
This work proposes a method called Deep Adaptation Modules (DAM) that constrains newly learned filters to be linear combinations of existing ones, reducing the parameter cost to around 3 percent of the original with negligible or no loss in accuracy.
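A hedged sketch of the constraint this summary describes: task-specific filters are formed as learned linear combinations of the frozen pretrained filters, so only a small per-layer coefficient matrix needs to be stored for each task. The class name and initialization below are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearCombinationConv2d(nn.Module):
    def __init__(self, pretrained_conv: nn.Conv2d):
        super().__init__()
        w = pretrained_conv.weight.data.clone()        # (out, in, kH, kW), kept frozen
        self.register_buffer("base_weight", w)
        out_ch = w.shape[0]
        # Per-task coefficients: each new filter is a mix of the existing filters.
        # Identity initialization reproduces the pretrained layer at the start.
        self.coeff = nn.Parameter(torch.eye(out_ch))
        self.stride, self.padding = pretrained_conv.stride, pretrained_conv.padding

    def forward(self, x):
        # new_weight[o] = sum_k coeff[o, k] * base_weight[k]
        new_weight = torch.einsum("ok,kihw->oihw", self.coeff, self.base_weight)
        return F.conv2d(x, new_weight, None, stride=self.stride, padding=self.padding)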
Learning without Forgetting
  • Zhizhong Li, Derek Hoiem
  • Computer Science, Mathematics
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2018
This work proposes the Learning without Forgetting method, which uses only new-task data to train the network while preserving its original capabilities, and performs favorably compared to commonly used feature extraction and fine-tuning adaptation techniques.
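The objective this summary refers to can be sketched as a new-task cross-entropy term plus a distillation term that keeps the old-task head close to the responses recorded before training on the new task; the temperature and weighting below are illustrative assumptions, not the paper's exact settings.

import torch
import torch.nn.functional as F

def lwf_loss(new_logits, new_labels, old_logits, recorded_old_logits,
             temperature: float = 2.0, distill_weight: float = 1.0):
    """Cross-entropy on the new task plus distillation against pre-recorded old-task outputs."""
    ce = F.cross_entropy(new_logits, new_labels)
    # Soften both distributions and penalize their divergence (knowledge distillation).
    log_p_old = F.log_softmax(old_logits / temperature, dim=1)
    p_ref = F.softmax(recorded_old_logits / temperature, dim=1)
    distill = F.kl_div(log_p_old, p_ref, reduction="batchmean") * temperature ** 2
    return ce + distill_weight * distill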
Encoder Based Lifelong Learning
A new lifelong learning solution is proposed in which a single model is trained on a sequence of tasks, using autoencoders to preserve the knowledge of previous tasks while learning a new one.
Learning both Weights and Connections for Efficient Neural Network
A method is presented that reduces the storage and computation required by neural networks by an order of magnitude without affecting their accuracy, by learning only the important connections and pruning redundant connections with a three-step procedure.
Dynamic Network Surgery for Efficient DNNs
A novel network compression method called dynamic network surgery is proposed, which can remarkably reduce network complexity through on-the-fly connection pruning and is shown to outperform a recent pruning method by considerable margins.
UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory
  • I. Kokkinos
  • Computer Science
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2017
In this work we train in an end-to-end manner a convolutional neural network (CNN) that jointly handles low-, mid-, and high-level vision tasks in a unified architecture. Such a network can act like…
Learning multiple visual domains with residual adapters
This paper develops a tunable deep network architecture that, by means of adapter residual modules, can be steered on the fly to diverse visual domains, and introduces the Visual Decathlon Challenge, a benchmark that evaluates the ability of representations to capture ten very different visual domains simultaneously and measures how uniformly well they perform across them.
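A minimal sketch of the adapter idea in this summary: a small domain-specific 1x1 convolution (with its own normalization) is added as a residual branch around each frozen convolution, so steering the network to a new domain only swaps these adapter parameters. The module names below are assumptions for illustration.

import torch.nn as nn

class ResidualAdapterConv(nn.Module):
    def __init__(self, pretrained_conv: nn.Conv2d):
        super().__init__()
        self.base = pretrained_conv
        for p in self.base.parameters():
            p.requires_grad_(False)             # shared, domain-agnostic weights
        out_ch = pretrained_conv.out_channels
        self.adapter = nn.Conv2d(out_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)        # domain-specific normalization

    def forward(self, x):
        y = self.base(x)
        return y + self.bn(self.adapter(y))     # residual adapter branch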
Integrated perception with recurrent multi-task neural networks
This work proposes a new architecture, called "MultiNet", in which not only are deep image features shared between tasks, but tasks can interact in a recurrent manner by encoding the results of their analysis in a common shared representation of the data.
Incremental Learning of Object Detectors without Catastrophic Forgetting
This work presents a method to learn object detectors incrementally, when neither the original training data nor annotations for the original classes in the new training set are available, and reports object detection results on the PASCAL VOC 2007 and COCO datasets.