Corpus ID: 220280805

Measuring Robustness to Natural Distribution Shifts in Image Classification

@article{Taori2020MeasuringRT,
  title={Measuring Robustness to Natural Distribution Shifts in Image Classification},
  author={Rohan Taori and Achal Dave and Vaishaal Shankar and Nicholas Carlini and Benjamin Recht and Ludwig Schmidt},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.00644}
}
We study how robust current ImageNet models are to distribution shifts arising from natural variations in datasets. Most research on robustness focuses on synthetic image perturbations (noise, simulated weather artifacts, adversarial examples, etc.), which leaves open how robustness on synthetic distribution shift relates to distribution shift arising in real data. Informed by an evaluation of 204 ImageNet models in 213 different test conditions, we find that there is often little to no…
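The basic quantity the abstract describes, a model's accuracy drop when moving from the standard test set to a naturally shifted one, can be sketched as follows. This is a hedged illustration, not the authors' evaluation code; the prediction and label arrays are toy placeholders.

```python
# Sketch (not the paper's code): compare a classifier's accuracy on the
# standard test set against its accuracy on a distribution-shifted test set.

def accuracy(preds, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def accuracy_drop(preds_std, labels_std, preds_shift, labels_shift):
    """Standard-test-set accuracy minus accuracy under distribution shift."""
    return accuracy(preds_std, labels_std) - accuracy(preds_shift, labels_shift)

# Toy example: 90% accuracy in-distribution, 70% under shift.
preds_std    = [0, 1, 2, 3, 4, 5, 6, 7, 8, 0]
labels_std   = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
preds_shift  = [0, 1, 2, 3, 4, 5, 6, 0, 0, 0]
labels_shift = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(accuracy_drop(preds_std, labels_std, preds_shift, labels_shift))
```

The paper's "effective robustness" analysis goes further, comparing each model's shifted accuracy against the trend line predicted from its standard accuracy, but the raw drop above is the starting point.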

Citations

Contemplating real-world object classification
TLDR: The results indicate that limiting the object area as much as possible leads to consistent improvement in accuracy and robustness, and show that ObjectNet is still a challenging test platform for evaluating the generalization ability of models.
Learning Transferable Visual Models From Natural Language Supervision
TLDR: It is demonstrated that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet.
Measuring Robustness in Deep Learning Based Compressive Sensing
TLDR: The robustness of different approaches for image reconstruction, including trained and un-trained neural networks as well as traditional sparsity-based methods, is measured, and it is found that both trained and un-trained methods are vulnerable to adversarial perturbations.
Optimism in the Face of Adversity: Understanding and Improving Deep Learning Through Adversarial Robustness
TLDR: The goal of this article is to provide readers with a set of new perspectives to understand deep learning and supply them with intuitive tools and insights on how to use adversarial robustness to improve it.
Robust fine-tuning of zero-shot models
TLDR: This work introduces a simple and effective method for improving robustness: ensembling the weights of the zero-shot and fine-tuned models. The resulting weight-space ensembles provide large accuracy improvements out of distribution while matching or improving in-distribution accuracy.
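The weight-space ensembling described in the entry above reduces to linearly interpolating each parameter of the zero-shot model with its fine-tuned counterpart. A minimal sketch, assuming both checkpoints share the same parameter names; the state dicts and values here are toy placeholders, not a real model:

```python
# Sketch of weight-space ensembling (not the authors' implementation):
# theta = alpha * theta_finetuned + (1 - alpha) * theta_zeroshot, per parameter.

def interpolate_weights(zero_shot, fine_tuned, alpha=0.5):
    """Return a new state dict mixing the two models' parameters."""
    assert zero_shot.keys() == fine_tuned.keys(), "checkpoints must match"
    return {
        name: alpha * fine_tuned[name] + (1 - alpha) * zero_shot[name]
        for name in zero_shot
    }

# Toy "state dicts" with scalar parameters; real models use weight tensors.
zs = {"layer.weight": 1.0, "layer.bias": 0.0}
ft = {"layer.weight": 3.0, "layer.bias": 2.0}
print(interpolate_weights(zs, ft, alpha=0.5))  # midpoint of the two models
```

With tensor-valued parameters the same elementwise formula applies; alpha = 0 recovers the zero-shot model and alpha = 1 the fine-tuned one.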
Underspecification Presents Challenges for Credibility in Modern Machine Learning
TLDR: This work shows that underspecification appears in a wide variety of practical ML pipelines, and that it must be explicitly accounted for in modeling pipelines intended for real-world deployment in any domain.
Contrasting Contrastive Self-Supervised Representation Learning Models
TLDR: This paper analyzes contrastive approaches as one of the most successful and popular variants of self-supervised representation learning, examining over 700 training experiments including 30 encoders, 4 pre-training datasets, and 20 diverse downstream tasks.
Predicting with Confidence on Unseen Distributions
TLDR: This work investigates the distinction between synthetic and natural distribution shifts and finds that the difference of confidences of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts.
3DB: A Framework for Debugging Computer Vision Models
TLDR: It is demonstrated, through a wide range of use cases, that 3DB allows users to discover vulnerabilities in computer vision systems and gain insights into how models make decisions, and that the insights generated by the system transfer to the physical world.
A comparison of approaches to improve worst-case predictive model performance over patient subpopulations

References

Showing 1–10 of 127 references
AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty
TLDR: AugMix significantly improves robustness and uncertainty measures on challenging image classification benchmarks, closing the gap between previous methods and the best possible performance, in some cases by more than half.
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
TLDR: This paper standardizes and expands the corruption robustness topic, while showing which classifiers are preferable in safety-critical applications, and proposes a new dataset called ImageNet-P which enables researchers to benchmark a classifier's robustness to common perturbations.
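Corruption benchmarks of the kind described above score a model by averaging its error over corruption types and severity levels. A simplified sketch of that aggregation, with toy error values; the benchmark's actual mCE metric additionally normalizes each corruption's error by a baseline model's error, which is omitted here:

```python
# Sketch of corruption-benchmark aggregation (simplified; the real mCE
# metric divides by a baseline model's per-corruption errors first).

def mean_corruption_error(errors_by_corruption):
    """Average top-1 error over severities, then over corruption types.

    errors_by_corruption maps each corruption name to a list of error
    rates, one per severity level (typically 1..5).
    """
    per_corruption = [sum(errs) / len(errs)
                      for errs in errors_by_corruption.values()]
    return sum(per_corruption) / len(per_corruption)

# Toy error rates for two corruption types at five severities each.
errors = {
    "gaussian_noise": [0.30, 0.40, 0.50, 0.60, 0.70],
    "fog":            [0.20, 0.25, 0.30, 0.35, 0.40],
}
print(mean_corruption_error(errors))
```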
ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models
TLDR: A highly automated platform is developed that enables gathering datasets with controls at scale, using automated tools throughout machine learning to generate datasets that exercise models in new ways and thus provide valuable feedback to researchers.
Generalisation in humans and deep neural networks
TLDR: The robustness of humans and current convolutional deep neural networks on object recognition under twelve different types of image degradations is compared, showing that DNNs trained directly on distorted images consistently surpass human performance on the exact distortion types they were trained on.
Do ImageNet Classifiers Generalize to ImageNet?
TLDR: The results suggest that the accuracy drops are not caused by adaptivity, but by the models' inability to generalize to slightly "harder" images than those found in the original test sets.
In Search of Lost Domain Generalization
TLDR: This paper implements DomainBed, a testbed for domain generalization including seven multi-domain datasets, nine baseline algorithms, and three model selection criteria, and finds that, when carefully implemented, empirical risk minimization shows state-of-the-art performance across all datasets.
Natural Adversarial Examples
TLDR: A new test set of natural adversarial examples, called ImageNet-A, is introduced as a way to measure classifier robustness, and it is shown that some architectural changes can enhance robustness to these examples.
A Human-Centered Evaluation of a Deep Learning System Deployed in Clinics for the Detection of Diabetic Retinopathy
TLDR: A human-centered study of a deep learning system used in clinics for the detection of diabetic eye disease in Thailand indicates that several socio-environmental factors impact model performance, nursing workflows, and the patient experience.
Adversarial Examples Improve Image Recognition
TLDR: This work proposes AdvProp, an enhanced adversarial training scheme which treats adversarial examples as additional examples to prevent overfitting, and shows that AdvProp improves a wide range of models on various image recognition tasks and performs better when the models are bigger.