# Convergence rates of deep ReLU networks for multiclass classification

```bibtex
@article{Bos2021ConvergenceRO,
  title   = {Convergence rates of deep ReLU networks for multiclass classification},
  author  = {Thijs Bos and Johannes Schmidt-Hieber},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2108.00969}
}
```

For classification problems, trained deep neural networks return probabilities of class memberships. In this work, we study convergence of the learned probabilities to the true conditional class probabilities. More specifically, we consider sparse deep ReLU network reconstructions minimizing cross-entropy loss in the multiclass classification setup. Interesting phenomena occur when the class membership probabilities are close to zero. Convergence rates are derived that depend on the near-zero…
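The setup above can be illustrated numerically: a ReLU network with a softmax output, trained by minimizing the cross-entropy loss, produces estimates of the conditional class probabilities. The following is a minimal NumPy sketch of that setup, not the sparse estimator analyzed in the paper; the network size, data distribution, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 3-class data: x in R^2, label determined by the angle of x.
n, d, K = 600, 2, 3
X = rng.normal(size=(n, d))
y = ((np.arctan2(X[:, 1], X[:, 0]) // (2 * np.pi / K)) % K).astype(int)
Y = np.eye(K)[y]  # one-hot labels

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# One hidden ReLU layer (width m is an arbitrary choice).
m = 32
W1 = rng.normal(scale=0.5, size=(d, m)); b1 = np.zeros(m)
W2 = rng.normal(scale=0.5, size=(m, K)); b2 = np.zeros(K)

def forward(X):
    H = relu(X @ W1 + b1)
    return H, softmax(H @ W2 + b2)  # rows are estimated class probabilities

def cross_entropy(P, Y):
    return -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))

loss0 = cross_entropy(forward(X)[1], Y)

# Plain gradient descent on the cross-entropy loss.
lr = 0.5
for step in range(300):
    H, P = forward(X)
    G = (P - Y) / n              # gradient of mean cross-entropy w.r.t. logits
    gW2, gb2 = H.T @ G, G.sum(0)
    GH = (G @ W2.T) * (H > 0)    # backprop through the ReLU
    gW1, gb1 = X.T @ GH, GH.sum(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

loss_final = cross_entropy(forward(X)[1], Y)
```

Each row of the softmax output is a probability vector over the K classes; the paper's question is how fast these learned probabilities approach the true conditional class probabilities, which is delicate precisely when some of those probabilities are near zero (the logarithm in the loss blows up there).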

#### References

Showing 1–10 of 33 references

Fast convergence rates of deep neural networks for classification

- Computer Science, Medicine · Neural Networks
- 2021

It is shown that the DNN classifier learned using the hinge loss achieves fast convergence rates in all three cases, provided that the architecture (i.e., the number of layers, number of nodes, and sparsity) is carefully selected.

Statistical theory for image classification using deep convolutional neural networks with cross-entropy loss

- Mathematics
- 2020

Convolutional neural networks learned by minimizing the cross-entropy loss are nowadays the standard for image classification. Till now, the statistical theory behind those networks is lacking. We…

On deep learning as a remedy for the curse of dimensionality in nonparametric regression

- Mathematics · The Annals of Statistics
- 2019

Assuming that a smoothness condition and a suitable restriction on the structure of the regression function hold, it is shown that least squares estimates based on multilayer feedforward neural…

On the rate of convergence of fully connected very deep neural network regression estimates

- Mathematics, Computer Science · The Annals of Statistics
- 2021

This paper shows that similar results can also be obtained for least squares estimates based on simple fully connected neural networks with ReLU activation functions, building on new approximation results for deep neural networks.

A Moment Bound for Multi-hinge Classifiers

- Mathematics
- 2008

The success of support vector machines in binary classification relies on the fact that the hinge loss employed in the risk minimization targets the Bayes rule. Recent research explores some extensions…

Error bounds for approximations with deep ReLU networks

- Mathematics, Medicine · Neural Networks
- 2017

It is proved that deep ReLU networks approximate smooth functions more efficiently than shallow networks, and adaptive depth-6 network architectures that are more efficient than the standard shallow architecture are described.

Fast learning rates for plug-in classifiers

- Mathematics
- 2007

It has been recently shown that, under the margin (or low noise) assumption, there exist classifiers attaining fast rates of convergence of the excess Bayes risk, that is, rates faster than n^(-1/2)…

Nonparametric regression using deep neural networks with ReLU activation function

- Mathematics, Computer Science · ArXiv
- 2017

The discussant contributions highlight the gaps in the theoretical understanding and outline many possible directions for future research in this area.

ImageNet classification with deep convolutional neural networks

- Computer Science · Commun. ACM
- 2012

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

Optimal aggregation of classifiers in statistical learning

- Mathematics
- 2003

Classification can be considered as nonparametric estimation of sets, where the risk is defined by means of a specific distance between sets associated with misclassification error. It is shown that…