Corpus ID: 232168415

Active Testing: Sample-Efficient Model Evaluation

Authors: Jannik Kossen, Sebastian Farquhar, Y. Gal, Tom Rainforth
We introduce a new framework for sample-efficient model evaluation that we call active testing. While approaches like active learning reduce the number of labels needed for model training, existing literature largely ignores the cost of labeling test data, typically unrealistically assuming large test sets for model evaluation. This creates a disconnect to real applications, where test labels are important and just as expensive, e.g. for optimizing hyperparameters. Active testing addresses this…
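The abstract's core idea, estimating test loss from only a few labeled test points, can be illustrated with a generic importance-weighted risk estimate. This is a minimal sketch under stated assumptions, not the estimator the paper proposes: the names `importance_weighted_risk`, `loss_of`, and `proposal` are hypothetical, points are drawn with replacement, and the proposal weights are assumed to be strictly positive.

```python
import random


def importance_weighted_risk(loss_of, proposal, m, rng):
    """Estimate the mean test loss from m actively chosen test points.

    loss_of(i) -> loss of the model on test point i (hypothetical callable;
                  it is only evaluated on sampled points, so only those
                  points would need labels).
    proposal   -> one positive weight per test point; larger weights make a
                  point more likely to be picked for labeling.
    """
    n = len(proposal)
    total = sum(proposal)
    q = [p / total for p in proposal]  # normalise weights to probabilities
    # Draw m indices with replacement according to q.
    idx = rng.choices(range(n), weights=q, k=m)
    # Reweight each sampled loss by 1 / (n * q_i) so that the estimate
    # remains unbiased for the plain average loss over all n points.
    return sum(loss_of(i) / (n * q[i]) for i in idx) / m


if __name__ == "__main__":
    losses = [0.1, 0.5, 2.0, 0.4]  # toy per-point losses (illustrative only)
    rng = random.Random(0)
    # Uniform proposal: reduces to the usual average over a random subset.
    print(importance_weighted_risk(lambda i: losses[i], [1, 1, 1, 1], 2, rng))
    # Proposal proportional to the loss: every sample yields the true mean.
    print(importance_weighted_risk(lambda i: losses[i], losses, 2, rng))
```

With a uniform proposal the weights cancel and this is the standard subset average. A proposal that concentrates on high-loss points lowers the variance, which is why the choice of which test points to label matters.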


Scikit-learn: Machine Learning in Python
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing…
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms
Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms, as it shares the same image size, data format and the structure of training and testing splits.
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Active Risk Estimation
This work empirically studies conditions under which the active risk estimate is more accurate than a standard risk estimate that draws equally many instances from the test distribution.
Improved Regularization of Convolutional Neural Networks with Cutout
This paper shows that the simple regularization technique of randomly masking out square regions of input during training, which is called cutout, can be used to improve the robustness and overall performance of convolutional neural networks.
Wide Residual Networks
This paper conducts a detailed experimental study on the architecture of ResNet blocks and proposes a novel architecture in which the depth of residual networks is decreased and their width is increased. The resulting network structures, called wide residual networks (WRNs), are far superior to their commonly used thin and very deep counterparts.
Bayesian Active Learning for Classification and Preference Learning
This work proposes an approach that expresses information gain in terms of predictive entropies, applies this method to the Gaussian Process Classifier (GPC), and makes minimal approximations to the full information-theoretic objective.
On Statistical Bias In Active Learning: How and When To Fix It
It is shown that this bias can be actively helpful when training overparameterized models, like neural networks, with relatively little data, and novel corrective weights are introduced to remove bias when doing so is beneficial.
Radial Bayesian Neural Networks: Beyond Discrete Support In Large-Scale Bayesian Deep Learning
It is shown that, unlike MFVI, Radial BNNs are robust to hyperparameters and can be efficiently applied to a challenging real-world medical application without needing ad-hoc tweaks and intensive tuning.
2021) to implement the Radial BNNs. We use the following hyperparameters, which are default values taken from Farquhar et al. (2021): we use a learning rate of 1 × 10⁻⁴ and weight decay
  • Radial BNN on MNIST
  • 2021