Does progress on ImageNet transfer to real-world datasets?

We investigate this question by evaluating ImageNet pre-trained models with varying ImageNet accuracy (57%–83%) on six practical image classification datasets. In particular, we study datasets collected with the goal of solving real-world tasks.

From ImageNet to Image Classification: Contextualizing Progress on Benchmarks

This work uses human studies to investigate the consequences of employing a noisy data collection pipeline and to study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset, including the introduction of biases that state-of-the-art models exploit.

Do ImageNet Classifiers Generalize to ImageNet?

The results suggest that the accuracy drops are not caused by adaptivity, but by the models' inability to generalize to slightly "harder" images than those found in the original test sets.

What makes ImageNet good for transfer learning?

The overall findings suggest that most changes in the choice of pre-training data long thought to be critical do not significantly affect transfer performance.

Do Better ImageNet Models Transfer Better?

It is found that, when networks are used as fixed feature extractors or fine-tuned, there is a strong correlation between ImageNet accuracy and transfer accuracy, and ImageNet features are less general than previously suggested.
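The "fixed feature extractor" protocol compared in this work can be sketched as a linear probe: freeze the pre-trained backbone, extract penultimate-layer features, and fit only a linear classifier on the target dataset. The sketch below is illustrative, not the paper's exact pipeline; it uses randomly generated stand-ins for the extracted features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-ins for penultimate-layer features of an ImageNet pre-trained
# network; in practice these come from a forward pass with frozen weights.
n_train, n_test, dim, n_classes = 200, 50, 64, 5
X_train = rng.normal(size=(n_train, dim))
y_train = rng.integers(0, n_classes, size=n_train)
X_test = rng.normal(size=(n_test, dim))
y_test = rng.integers(0, n_classes, size=n_test)

# Fixed-feature-extractor protocol: the backbone stays frozen, and only a
# linear classifier (logistic regression) is trained on top of its features.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
transfer_acc = probe.score(X_test, y_test)
print(f"linear-probe transfer accuracy: {transfer_acc:.2f}")
```

Fine-tuning differs only in that the backbone weights are also updated on the target dataset, which is typically more expensive but more accurate.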

Are we done with ImageNet?

A significantly more robust procedure for collecting human annotations of the ImageNet validation set is developed. The original ImageNet labels are found to no longer be the best predictors of this independently collected label set, indicating that their usefulness in evaluating vision models may be nearing an end.

Is it enough to optimize CNN architectures on ImageNet?

This work investigates and improves ImageNet as a basis for deriving generally effective convolutional neural network architectures that perform well on a diverse set of datasets and application domains, and shows how to significantly increase the correlation between ImageNet performance and performance on other datasets by utilizing ImageNet subsets restricted to fewer classes.

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Transfusion: Understanding Transfer Learning for Medical Imaging

An investigation of the learned representations and features finds that some of the differences attributed to transfer learning are due to the over-parametrization of standard models rather than sophisticated feature reuse. The work also isolates where useful feature reuse occurs and outlines the implications for more efficient model exploration.

Evaluating Machine Accuracy on ImageNet

Overall, the results show that there is still substantial room for improvement on ImageNet, and that direct accuracy comparisons between humans and machines may overstate machine performance.

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
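The "16x16 words" in the title refer to ViT's first step: splitting an image into non-overlapping 16x16 patches and flattening each into a vector, so the image becomes a token sequence. The minimal numpy sketch below shows only this split-and-flatten step; it omits the learned linear projection and position embeddings that follow in the actual model.

```python
import numpy as np

# Toy input: a 224x224 RGB image, the standard ImageNet resolution.
image = np.zeros((224, 224, 3))
patch = 16

# Split into non-overlapping 16x16 patches and flatten each one into a
# vector -- the "words" that the Transformer consumes as a sequence.
h, w, c = image.shape
patches = image.reshape(h // patch, patch, w // patch, patch, c)
patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)

print(patches.shape)  # (196, 768): 14*14 patches, each 16*16*3 values
```

Each of the 196 patch vectors is then linearly projected to the model dimension and augmented with a position embedding before entering the Transformer encoder.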