TransTailor: Pruning the Pre-trained Model for Improved Transfer Learning

Bingyan Liu, Yifeng Cai, Yao Guo, Xiangqun Chen
AAAI Conference on Artificial Intelligence
The growing number of pre-trained models has significantly improved performance on limited-data tasks through transfer learning. However, progress on transfer learning mainly focuses on optimizing the weights of pre-trained models, which ignores the structure mismatch between the model and the target task. This paper aims to improve transfer performance from another angle: in addition to tuning the weights, we tune the structure of pre-trained models in order to better match the target…


PFA: Privacy-preserving Federated Adaptation for Effective Model Personalization

PFA leverages the sparsity property of neural networks to generate privacy-preserving representations and uses them to efficiently identify clients with similar data distributions that can cooperate with each other during federated adaptation.

A GPU-accelerated Algorithm for Distinct Discriminant Canonical Correlation Network

A GPU-based accelerated algorithm is proposed to further optimize the DDCCANet algorithm, which greatly shortens the calculation time, making the model more applicable in real tasks.

Reducing Computational Complexity of Neural Networks in Optical Channel Equalization: From Concepts to Implementation

It is shown that it is possible to design an NN-based equalizer that is simpler to implement and has better performance than the conventional digital back-propagation (DBP) equalizer with only one step per span.

Target Aware Network Architecture Search and Compression for Efficient Knowledge Transfer

The proposed TASCNet reduces the computational complexity of pre-trained CNNs over the target task by reducing both trainable parameters and FLOPs which enables resource-efficient knowledge transfer.

ReMoS: Reducing Defect Inheritance in Transfer Learning via Relevant Model Slicing

ReMoS, a relevant model slicing technique, is proposed to reduce defect inheritance during transfer learning while retaining useful knowledge from the teacher model. It computes a model slice that is relevant to the student task, based on neuron coverage information obtained by profiling the teacher model on the student task.


This work deploys a convolutional neural network (CNN) on resource-constrained IoT devices to make them intelligent and realistic, and proposes decentralized heterogeneous edge clusters deployed with an optimized pre-trained YOLOv2 model.

ODMTCNet: An Interpretable Multi-view Deep Neural Network Architecture for Image Feature Representation

This work demonstrates that, in ODMTCNet, the relation between the optimal performance and parameters can be predicted, with each layer generating justified knowledge representations, leading to an interpretable multi-view based convolutional network.

DistFL: Distribution-aware Federated Learning for Mobile Scenarios




DELTA: DEep Learning Transfer using Feature Map with Attention for Convolutional Networks

DELTA (DEep Learning Transfer using Feature Map with Attention) is a novel regularized transfer learning framework that intends to align the outer layer outputs of two networks by constraining a subset of feature maps that are precisely selected by attention learned in a supervised manner.

Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks

The proposed Soft Filter Pruning (SFP) method enables the pruned filters to be updated when training the model after pruning, which has two advantages over previous works: larger model capacity and less dependence on the pretrained model.

Explicit Inductive Bias for Transfer Learning with Convolutional Networks

This paper investigates several regularization schemes that explicitly promote the similarity of the final solution with the initial model, and eventually recommends a simple $L^2$ penalty with the pre-trained model being a reference as the baseline of penalty for transfer learning tasks.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

Novel Dataset for Fine-Grained Image Categorization: Stanford Dogs

A 120 class Stanford Dogs dataset, a challenging and large-scale dataset aimed at fine-grained image categorization, is introduced, which includes over 22,000 annotated images of dogs belonging to 120 species.

Recognizing indoor scenes

A prototype-based model that successfully combines local and global discriminative information is proposed; it can significantly outperform a state-of-the-art classifier on the indoor scene recognition task.

Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks

This work proposes a global filter pruning algorithm called Gate Decorator, which transforms a vanilla CNN module by multiplying its output by the channel-wise scaling factors (i.e. gate), and proposes an iterative pruning framework called Tick-Tock to improve pruning accuracy.

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

A new scaling method is proposed that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient, and its effectiveness is demonstrated by scaling up MobileNets and ResNet.

How transferable are features in deep neural networks?

This paper quantifies the generality versus specificity of neurons in each layer of a deep convolutional neural network and reports a few surprising results, including that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.

ImageNet: A large-scale hierarchical image database

A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.