Transferability Estimation using Bhattacharyya Class Separability

@article{Pandy2022TransferabilityEU,
  title={Transferability Estimation using Bhattacharyya Class Separability},
  author={Michal P{\'a}ndy and A. Agostinelli and Jasper R. R. Uijlings and Vittorio Ferrari and Thomas Mensink},
  journal={2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022},
  pages={9162-9172}
}
Transfer learning has become a popular method for leveraging pre-trained models in computer vision. However, without performing computationally expensive fine-tuning, it is difficult to quantify which pre-trained source models are suitable for a specific target task, or, conversely, to which tasks a pre-trained source model can be easily adapted. In this work, we propose Gaussian Bhattacharyya Coefficient (GBC), a novel method for quantifying transferability between a source model and a…
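The abstract above is cut off before it explains how GBC is computed, so the sketch below does not reproduce the paper's method. It only illustrates the standard closed-form Bhattacharyya distance and coefficient between two Gaussians, under the simplifying assumption of diagonal covariances. The function names and the list-of-floats input format are hypothetical choices made for this illustration.

```python
import math


def bhattacharyya_distance_diag(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two Gaussians with diagonal covariance.

    Assumption for this sketch: each Gaussian is given as a list of
    per-dimension means and a list of per-dimension (positive) variances.
    """
    if not (len(mean_a) == len(var_a) == len(mean_b) == len(var_b)):
        raise ValueError("all inputs must have the same number of dimensions")

    distance = 0.0
    for m_a, v_a, m_b, v_b in zip(mean_a, var_a, mean_b, var_b):
        v_avg = 0.5 * (v_a + v_b)  # averaged variance for this dimension
        # Term penalising the gap between the means.
        distance += 0.125 * (m_a - m_b) ** 2 / v_avg
        # Term penalising the mismatch between the variances.
        distance += 0.5 * math.log(v_avg / math.sqrt(v_a * v_b))
    return distance


def bhattacharyya_coefficient_diag(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya coefficient in (0, 1]; 1 means identical Gaussians."""
    return math.exp(-bhattacharyya_distance_diag(mean_a, var_a, mean_b, var_b))


if __name__ == "__main__":
    # Two hypothetical 2-D class distributions in some feature space.
    overlap = bhattacharyya_coefficient_diag([0.0, 0.0], [1.0, 1.0],
                                             [1.0, 0.5], [1.5, 0.8])
    print(f"overlap between the two classes: {overlap:.4f}")
```

A coefficient near 1 means the two distributions overlap heavily, while a coefficient near 0 means they are well separated. How such pairwise overlaps are combined into a transferability score is not stated in the truncated text.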

On the pitfalls of entropy-based uncertainty for multi-class semi-supervised segmentation

TLDR
It is argued that these approaches underperform because of erroneous uncertainty approximations in the presence of inter-class overlap, and an alternative way to compute uncertainty in a multi-class setting is proposed that is based on divergence distances and accounts for inter-class overlap.

Improving the Generalization of Supervised Models

TLDR
This paper enriches the common supervised training framework using two key components of recent SSL models: multi-scale crops for data augmentation and the use of an expendable projector head, and replaces the last layer of class weights with class prototypes computed on the fly using a memory bank.

How stable are Transferability Metrics evaluations?

TLDR
It is discovered that even small variations to an experimental setup lead to different conclusions about the superiority of one transferability metric over another, and better evaluations are proposed by aggregating across many experiments, making it possible to reach more stable conclusions.

Assessing the Value of Transfer Learning Metrics for RF Domain Adaptation

TLDR
This work begins this examination by evaluating how radio frequency (RF) domain changes encourage or prevent the transfer of features learned by convolutional neural network (CNN)-based automatic modulation classifiers.

Selective Cross-Task Distillation

The outpouring of various pre-trained models empowers knowledge distillation by providing abundant teacher resources, but a developed mechanism to utilize these teachers adequately is lacking. With

References


Geometric Dataset Distances via Optimal Transport

TLDR
This work proposes an alternative notion of distance between datasets that is model-agnostic, does not involve training, can compare datasets even if their label sets are completely disjoint and has solid theoretical footing.

Transferability and Hardness of Supervised Classification Tasks

TLDR
This work proposes a novel approach for estimating the difficulty and transferability of supervised classification tasks using an information theoretic approach, treating training labels as random variables and exploring their statistics, and provides results showing that these hardness and transferability estimates are strongly correlated with empirical hardness and transferability.

LEEP: A New Measure to Evaluate Transferability of Learned Representations

TLDR
This work introduces a new measure to evaluate the transferability of representations learned by classifiers, the Log Expected Empirical Prediction (LEEP), which can predict the performance and convergence speed of both transfer and meta-transfer learning methods, even for small or imbalanced data.

MSeg: A Composite Dataset for Multi-Domain Semantic Segmentation

TLDR
This work adopts zero-shot cross-dataset transfer as a benchmark to systematically evaluate a model’s robustness and shows that MSeg training yields substantially more robust models in comparison to training on individual datasets or naive mixing of datasets without the presented contributions.

Frustratingly Easy Domain Adaptation

We describe an approach to domain adaptation that is appropriate exactly in the case when one has enough “target” data to do slightly better than just using only “source” data. Our approach is

Stacked Hourglass Networks for Human Pose Estimation

TLDR
This work introduces a novel convolutional network architecture for the task of human pose estimation that is described as a “stacked hourglass” network based on the successive steps of pooling and upsampling that are done to produce a final set of predictions.

MobileNetV2: Inverted Residuals and Linear Bottlenecks

TLDR
A new mobile architecture, MobileNetV2, is described that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes and allows decoupling of the input/output domains from the expressiveness of the transformation.

Factors of Transferability for a Generic ConvNet Representation

TLDR
This paper introduces and investigates several factors affecting the transferability of Convolutional Networks, and shows that significant improvements can be achieved on various visual recognition tasks.

Best Practices for Fine-Tuning Visual Classifiers to New Domains

TLDR
It is concluded, with a few exceptions, that it is best to copy as many layers of a pre-trained network as possible, and then adjust the level of fine-tuning based on the visual distance from source.

Identity Mappings in Deep Residual Networks

TLDR
The propagation formulations behind the residual building blocks suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation.
...