Corpus ID: 239998556

Reliable and Trustworthy Machine Learning for Health Using Dataset Shift Detection

Chunjong Park, Anas Awadalla, Tadayoshi Kohno, Shwetak Patel
Unpredictable ML model behavior on unseen data, especially in the health domain, raises serious safety concerns, as the repercussions of mistakes can be fatal. In this paper, we explore the feasibility of using state-of-the-art out-of-distribution detectors for reliable and trustworthy diagnostic predictions. We select publicly available deep learning models relating to various health conditions (e.g., skin cancer, lung sound, and Parkinson’s disease) using various input data types (e.g…



A Benchmark of Medical Out of Distribution Detection
Despite yielding good results on some categories of out-of-distribution samples, existing methods fail to recognize images close to the training distribution; a simple binary classifier on the feature representation has the best accuracy and AUPRC on average.
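The binary-classifier baseline that the benchmark above finds strongest can be sketched as a logistic regression trained on frozen feature representations, separating in-distribution from known-outlier features. This is a minimal numpy sketch under that assumption; the function names and toy data are illustrative, not the benchmark's actual setup.

```python
import numpy as np

def train_ood_binary_classifier(feats_in, feats_out, lr=0.1, steps=500):
    """Logistic regression on frozen feature vectors:
    label 1 = in-distribution, label 0 = out-of-distribution."""
    X = np.vstack([feats_in, feats_out])
    y = np.concatenate([np.ones(len(feats_in)), np.zeros(len(feats_out))])
    X = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))      # predicted in-dist probability
        w -= lr * X.T @ (p - y) / len(y)      # mean cross-entropy gradient step
    return w

def in_dist_prob(feat, w):
    """Probability that a single feature vector is in-distribution."""
    z = np.append(feat, 1.0) @ w
    return 1.0 / (1.0 + np.exp(-z))
```

At inference time, features whose probability falls below a chosen threshold would be flagged as out-of-distribution.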
Likelihood Ratios for Out-of-Distribution Detection
This work investigates deep generative model-based approaches to OOD detection, observes that the likelihood score is heavily affected by population-level background statistics, and proposes a likelihood-ratio method for deep generative models that effectively corrects for these confounding background statistics.
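The likelihood-ratio idea can be illustrated with a toy numpy sketch: score an input by the difference between the log-likelihood under a model fit on in-distribution data and under a "background" model fit on perturbed data. The Gaussian models here are a stand-in assumption; the paper uses deep generative models.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_logpdf(x, mu, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

# "Semantic" model: fit on in-distribution data
in_data = rng.normal(0.0, 1.0, 1000)
mu, var = in_data.mean(), in_data.var()

# "Background" model: fit on input-perturbed data, capturing
# population-level background statistics
bg_data = in_data + rng.normal(0.0, 2.0, 1000)
mu_bg, var_bg = bg_data.mean(), bg_data.var()

def llr_score(x):
    """Likelihood-ratio OOD score: higher means more in-distribution."""
    return gaussian_logpdf(x, mu, var) - gaussian_logpdf(x, mu_bg, var_bg)
```

Subtracting the background log-likelihood cancels the confounding statistics both models share, leaving a score driven by the in-distribution structure.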
Self-Supervised Out-of-Distribution Detection in Brain CT Scans
This paper proposes a novel self-supervised learning technique for anomaly detection that trains a model on large-sized normal scans and detects abnormal scans by calculating the reconstruction error; self-supervised learning with context restoration is used to pretrain the model.
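The reconstruction-error criterion can be sketched with a rank-1 linear "autoencoder" (PCA) as a stand-in for the paper's network: fit on normal data only, then score a new sample by how poorly the model reconstructs it. All names and the toy 2-D data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Normal training data lies near a 1-D subspace of 2-D space
t = rng.normal(size=500)
normal = np.stack([t, 2 * t], axis=1) + 0.05 * rng.normal(size=(500, 2))

# Fit a rank-1 linear "autoencoder" (top principal direction) on normal data only
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
basis = vt[:1]

def reconstruction_error(x):
    """Squared reconstruction error; large values suggest an abnormal sample."""
    z = (x - mean) @ basis.T   # encode
    x_hat = z @ basis + mean   # decode
    return float(np.sum((x - x_hat) ** 2))
```

Samples consistent with the normal manifold reconstruct almost exactly, while off-manifold samples incur a large error and can be flagged by thresholding.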
RespireNet: A Deep Neural Network for Accurately Detecting Abnormal Lung Sounds in Limited Data Setting
This work proposes a simple CNN-based model along with a suite of novel techniques (device-specific fine-tuning, concatenation-based augmentation, blank region clipping, and smart padding) that enable efficient use of a small-sized dataset.
Self-Supervised Learning for Generalizable Out-of-Distribution Detection
This work proposes a new technique relying on self-supervision for learning generalizable out-of-distribution (OOD) features and rejecting OOD samples at inference time; it does not need prior knowledge of the targeted OOD distribution and incurs no extra overhead compared to other methods.
Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples
A novel training method for classifiers is proposed so that such OOD-detection inference algorithms can work better; its effectiveness is demonstrated using deep convolutional neural networks on various popular image datasets.
Detecting Out-Of-Distribution Samples Using Low-Order Deep Features Statistics
The ability to detect when an input sample was not drawn from the training distribution is an important desirable property of deep neural networks. In this paper, we show that a simple ensembling of …
Calibrating Healthcare AI: Towards Reliable and Interpretable Deep Predictive Models
This paper argues that these two objectives of characterizing model reliability and enabling rigorous introspection of model behavior are not necessarily disparate and proposes to utilize prediction calibration to meet both objectives.
Detecting and Correcting for Label Shift with Black Box Predictors
Black Box Shift Estimation (BBSE) is proposed to estimate the test label distribution p(y); it is proved that BBSE works even when predictors are biased, inaccurate, or uncalibrated, so long as their confusion matrices are invertible.
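The BBSE estimator described above can be sketched in a few lines of numpy: build the joint confusion matrix of the black-box predictor on held-out labeled data, measure the predictor's output distribution on the test set, and solve a linear system for the importance weights w[y] = p_test(y)/p_train(y). This is a minimal sketch of the estimator, with illustrative names.

```python
import numpy as np

def bbse_weights(y_val, yhat_val, yhat_test, n_classes):
    """Black Box Shift Estimation of w[y] = p_test(y) / p_train(y)."""
    # Joint confusion matrix C[i, j] = p_val(yhat = i, y = j)
    C = np.zeros((n_classes, n_classes))
    for yh, y in zip(yhat_val, y_val):
        C[yh, y] += 1
    C /= len(y_val)
    # Distribution of the predictor's outputs on the (unlabeled) test set
    mu = np.bincount(yhat_test, minlength=n_classes) / len(yhat_test)
    # Solving C w = mu recovers the weights even for a biased or
    # uncalibrated predictor, as long as C is invertible
    return np.linalg.solve(C, mu)
```

For a perfect predictor, C is diagonal with the training label frequencies, and the solve reduces to dividing test by train frequencies per class.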
A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks
This paper proposes a simple yet effective method for detecting any abnormal samples that is applicable to any pre-trained softmax neural classifier; it obtains class-conditional Gaussian distributions over the (low- and upper-level) features of the deep model using Gaussian discriminant analysis.
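The class-conditional Gaussian criterion can be sketched as follows: fit a per-class mean and a tied covariance over feature vectors (Gaussian discriminant analysis), then score a new sample by its Mahalanobis distance to the closest class. This numpy sketch assumes a single feature layer; the paper combines several layers, and the names here are illustrative.

```python
import numpy as np

def fit_mahalanobis(features, labels, n_classes):
    """Per-class means and a shared (tied) precision matrix over features."""
    means = np.stack([features[labels == c].mean(axis=0)
                      for c in range(n_classes)])
    centered = features - means[labels]
    cov = centered.T @ centered / len(features)
    return means, np.linalg.inv(cov)

def mahalanobis_score(x, means, prec):
    """Confidence score: negative Mahalanobis distance to the
    closest class-conditional Gaussian; low scores suggest OOD."""
    d = [(x - m) @ prec @ (x - m) for m in means]
    return -min(d)
```

Because the score only needs feature statistics of the training set, it applies to any pre-trained classifier without retraining.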