• Corpus ID: 235358945

# Evaluating State-of-the-Art Classification Models Against Bayes Optimality

@inproceedings{Theisen2021EvaluatingSC,
title={Evaluating State-of-the-Art Classification Models Against Bayes Optimality},
author={Ryan Theisen and Huan Wang and Lav R. Varshney and Caiming Xiong and Richard Socher},
booktitle={NeurIPS},
year={2021}
}
• Published in NeurIPS 7 June 2021
• Computer Science
Evaluating the inherent difficulty of a given data-driven classification problem is important for establishing absolute benchmarks and evaluating progress in the field. To this end, a natural quantity to consider is the Bayes error, which measures the optimal classification error theoretically achievable for a given data distribution. While generally an intractable quantity, we show that we can compute the exact Bayes error of generative models learned using normalizing flows. Our technique…
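The Bayes error the abstract refers to has a standard probabilistic form: for class priors π_y and class-conditional densities p(x|y), it is E_x[1 − max_y P(y|x)], the expected mass not captured by the optimal (argmax-posterior) classifier. Below is a minimal illustrative sketch, not the paper's normalizing-flow method: it assumes a toy mixture of two 1-D Gaussians with known densities and estimates the Bayes error by Monte Carlo.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2) evaluated at x.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def bayes_error_mc(mus, sigmas, priors, n=200_000, seed=0):
    """Monte Carlo estimate of the Bayes error E_x[1 - max_y P(y | x)]
    for a mixture of 1-D Gaussian classes with known parameters."""
    rng = np.random.default_rng(seed)
    # Sample x from the mixture: pick a class, then draw from that class.
    ys = rng.choice(len(priors), size=n, p=priors)
    xs = rng.normal(np.take(mus, ys), np.take(sigmas, ys))
    # Posterior P(y | x) is proportional to prior_y * p(x | y); normalize over classes.
    joint = np.stack([p * gauss_pdf(xs, m, s)
                      for p, m, s in zip(priors, mus, sigmas)])
    post = joint / joint.sum(axis=0)
    # Bayes error: on average, the probability mass the argmax class misses.
    return float(np.mean(1.0 - post.max(axis=0)))

# Two well-separated classes: optimal error is small but nonzero
# (the analytic value here is Phi(-2), roughly 0.023).
err = bayes_error_mc(mus=[-2.0, 2.0], sigmas=[1.0, 1.0], priors=[0.5, 0.5])
```

The paper's contribution is computing this quantity exactly when the densities p(x|y) are themselves learned normalizing flows, which give tractable exact likelihoods; the toy Gaussians above only stand in for those learned densities.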
## 2 Citations

### Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification

• Computer Science
ArXiv
• 2022
A simple and direct Bayes error estimator that takes the mean of the labels that show uncertainty about the classes; it has no hyperparameters and gives a more accurate estimate of the Bayes error than classifier-based baselines.

### SpinalNet: Deep Neural Network with Gradual Input

• Computer Science
IEEE Transactions on Artificial Intelligence
• 2022
The human somatosensory system is studied and SpinalNet is proposed to achieve higher accuracy with fewer computational resources while avoiding the vanishing gradient problem.

## References

SHOWING 1-10 OF 40 REFERENCES

### Learning to Benchmark: Determining Best Achievable Misclassification Error from Training Data

• Computer Science
ArXiv
• 2019
This work proposes a benchmark learner based on an ensemble of $\epsilon$-ball estimators and Chebyshev approximation that achieves an optimal (parametric) mean squared error (MSE) rate of $O(N^{-1})$, where $N$ is the number of samples.

### Meta learning of bounds on the Bayes classifier error

• Computer Science
2015 IEEE Signal Processing and Signal Processing Education Workshop (SP/SPE)
• 2015
This work estimates multiple bounds on the Bayes error using an estimator that applies meta learning to slowly converging plug-in estimators to obtain the parametric convergence rate.

### Understanding the Limitations of Conditional Generative Models

• Computer Science
ICLR
• 2020
The theoretical result reveals that it is impossible to guarantee detectability of adversarially-perturbed inputs even for near-optimal generative classifiers, and the results indicate that likelihood-based conditional generative models may be surprisingly ineffective for robust classification.

### Sharpness-Aware Minimization for Efficiently Improving Generalization

• Computer Science
ICLR
• 2021
This work introduces a novel, effective procedure for simultaneously minimizing loss value and loss sharpness, Sharpness-Aware Minimization (SAM), which improves model generalization across a variety of benchmark datasets and models, yielding novel state-of-the-art performance for several of them.

### Multivariate f-divergence Estimation With Confidence

• Computer Science, Mathematics
NIPS
• 2014
This work establishes the asymptotic normality of a recently proposed ensemble estimator of f-divergence between two distributions from a finite number of samples, which has an MSE convergence rate of O(1/T), is simple to implement, and performs well in high dimensions.

### Ensemble Estimation of Information Divergence †

• Computer Science, Mathematics
Entropy
• 2018
An empirical estimator of Rényi-α divergence is proposed that greatly outperforms the standard kernel density plug-in estimator in terms of mean squared error, especially in high dimensions and is shown to be robust to the choice of tuning parameters.

### DIME: An Information-Theoretic Difficulty Measure for AI Datasets

• Computer Science
• 2019
This work proposes DIME, an information-theoretic DIfficulty MEasure for datasets, based on Fano’s inequality and a neural network estimation of the conditional entropy of the sample-label distribution, which can be decomposed into components attributable to the data distribution and the number of samples.

### Normalizing Flows for Probabilistic Modeling and Inference

• Computer Science
J. Mach. Learn. Res.
• 2021
This review places special emphasis on the fundamental principles of flow design, and discusses foundational topics such as expressive power and computational trade-offs, and summarizes the use of flows for tasks such as generative modeling, approximate inference, and supervised learning.

### Nonparametric Divergence Estimation with Applications to Machine Learning on Distributions

• Computer Science
UAI
• 2011
Estimation algorithms are presented, their application to machine learning tasks on distributions is described, and empirical results on synthetic data, real-world images, and astronomical data sets are shown.