# Performance Prediction Under Dataset Shift

@inproceedings{Maggio2022PerformancePU,
  title={Performance Prediction Under Dataset Shift},
  author={Simona Maggio and Victor Bouvier and Leo Dreyfus-Schmidt},
  booktitle={2022 26th International Conference on Pattern Recognition (ICPR)},
  year={2022},
  pages={2466-2474}
}
• Published 21 June 2022
ML models deployed in production often have to face unknown domain changes, fundamentally different from their training settings. Performance prediction models carry out the crucial task of measuring the impact of these changes on model performance. We study the generalization capabilities of various performance prediction models to new domains by learning on generated synthetic perturbations. Empirical validation on a benchmark of ten tabular datasets shows that models based upon state-of-the…

## References

*Showing 1-10 of 20 references.*

• AAAI, 2021: This work uses transfer learning to train an uncertainty model to estimate the uncertainty of model performance predictions; the authors argue this makes prediction intervals, and performance prediction in general, significantly more practical for real-world use.
• HILDA@SIGMOD, 2019: This work proposes an approach to assist non-ML experts working with pretrained ML models: a performance predictor for pretrained black-box models, which can be combined with the model and automatically warns end users in case of unexpected performance drops.
• ICCV, 2021: This investigation determines that common distributional distances, such as Fréchet distance or Maximum Mean Discrepancy, fail to induce reliable estimates of performance under distribution shift, and finds that the proposed difference of confidences (DoC) approach yields successful estimates of a classifier's performance over a variety of shifts and model architectures.
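The DoC idea above is simple enough to sketch: the known source accuracy is shifted by the drop in average model confidence from source to target. This is a minimal sketch assuming max-softmax confidences; the function name and inputs are illustrative, not the authors' API.

```python
import numpy as np

def doc_predict(source_acc, source_conf, target_conf):
    """Difference of Confidences (DoC) sketch: adjust the known source
    accuracy by the drop in average confidence from source to target."""
    doc = source_conf.mean() - target_conf.mean()
    return source_acc - doc
```

For example, a model that was 90% accurate on source data but whose average confidence falls from 0.85 to 0.75 on the target would be predicted to be roughly 80% accurate there.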
• ICLR, 2022: Average Thresholded Confidence (ATC) is proposed: a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
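One simple way to realize the ATC recipe is to pick the threshold as a quantile of the held-out source confidences, so that the fraction of source examples above it matches the source accuracy. A hedged numpy sketch (function names are illustrative, and this assumes max-softmax confidence as the score):

```python
import numpy as np

def atc_threshold(source_conf, source_correct):
    """Choose t so that the fraction of source examples with
    confidence above t matches the source accuracy."""
    acc = source_correct.mean()
    return np.quantile(source_conf, 1.0 - acc)

def atc_predict(target_conf, t):
    """Predicted target accuracy: fraction of unlabeled target
    examples whose confidence exceeds the learned threshold."""
    return (target_conf > t).mean()
```

Note that ATC needs no target labels at prediction time, only the model's confidences on the unlabeled target set.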
• EMNLP, 2019: This paper investigates three families of methods (H-divergence, reverse classification accuracy, and confidence measures), shows how they can be used to predict the performance drop, and studies their robustness to adversarial domain shifts.
• NeurIPS, 2019: A large-scale benchmark of existing state-of-the-art methods on classification problems and the effect of dataset shift on accuracy and calibration is presented, finding that traditional post-hoc calibration does indeed fall short, as do several other previous methods.
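To make the calibration claim concrete, a common way to quantify miscalibration under shift is the Expected Calibration Error: bin predictions by confidence and average the per-bin gap between accuracy and confidence. A minimal sketch, not the benchmark's exact implementation:

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """ECE sketch: weighted average of |accuracy - confidence|
    over equal-width confidence bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap
    return ece
```

Under dataset shift, accuracy typically drops faster than confidence, which is exactly the regime where this gap grows and post-hoc calibration fitted on the source distribution stops helping.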
• arXiv, 2020: This work shows the need to explicitly account for underspecification in modeling pipelines that are intended for real-world deployment in any domain, and shows that this problem appears in a wide variety of practical ML pipelines.
• CVPR, 2021: This work constructs a meta-dataset, a dataset composed of datasets generated from the original images via various transformations such as rotation, background substitution, and foreground scaling, and reports a reasonable and promising prediction of the model accuracy.
• NeurIPS, 2020: It is found that there is often little to no transfer of robustness from current synthetic to natural distribution shift, and the results indicate that distribution shifts arising in real data are currently an open research problem.
• ICML, 2018: Black Box Shift Estimation (BBSE) is proposed to estimate the test distribution p(y), and BBSE is proved to work even when predictors are biased, inaccurate, or uncalibrated, so long as their confusion matrices are invertible.
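The confusion-matrix inversion behind BBSE can be sketched directly: under the label-shift assumption (p(x|y) fixed), the distribution of predicted labels on the target equals the source joint confusion matrix times the per-class ratio weights, so the weights are recovered by solving a linear system. A minimal numpy sketch with illustrative names:

```python
import numpy as np

def bbse_weights(y_true_src, y_pred_src, y_pred_tgt, n_classes):
    """BBSE sketch: solve C @ w = mu_hat, where
    C[i, j]   = P_src(pred=i, true=j)  (source joint confusion matrix),
    mu_hat[i] = P_tgt(pred=i)          (predicted-label marginal on target),
    w[j]      = p_tgt(y=j) / p_src(y=j) (importance weights)."""
    n = len(y_true_src)
    C = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true_src, y_pred_src):
        C[p, t] += 1.0 / n
    mu_hat = np.bincount(y_pred_tgt, minlength=n_classes) / len(y_pred_tgt)
    w = np.linalg.solve(C, mu_hat)  # requires an invertible confusion matrix
    p_src = np.bincount(y_true_src, minlength=n_classes) / n
    return w, w * p_src  # importance weights, estimated p_tgt(y)
```

The invertibility condition in the summary is visible here: `np.linalg.solve` fails exactly when the confusion matrix is singular, i.e. when the black-box predictor's outputs carry no usable class signal.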