Analysis of the rate of convergence of fully connected deep neural network regression estimates with smooth activation function

@article{Langer2021AnalysisOT,
  title={Analysis of the rate of convergence of fully connected deep neural network regression estimates with smooth activation function},
  author={Sophie Langer},
  journal={J. Multivar. Anal.},
  year={2021},
  volume={182},
  pages={104695}
}
  • Sophie Langer
  • Published 12 October 2020
  • Computer Science
  • J. Multivar. Anal.
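
For context, the estimates studied here are least squares estimates over classes of fully connected networks, judged by their expected L2 error. In generic notation (a standard formulation in this literature, not necessarily the paper's own): given i.i.d. data (X_1, Y_1), ..., (X_n, Y_n) with regression function m(x) = E[Y | X = x], the estimate and the error criterion are

m_n = \arg\min_{f \in \mathcal{F}_n} \frac{1}{n} \sum_{i=1}^{n} |f(X_i) - Y_i|^2,
\qquad
\mathbb{E} \int |m_n(x) - m(x)|^2 \, \mathbf{P}_X(dx),

where \mathcal{F}_n is a class of fully connected deep networks with a smooth (e.g., sigmoid) activation function whose depth and width may grow with the sample size n.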

Citations

Estimation of a regression function on a manifold by fully connected deep neural networks

VC dimension of partially quantized neural networks in the overparametrized regime

TLDR
It is shown that HANNs can have VC dimension significantly smaller than the number of weights, while being highly expressive, and that empirical risk minimization over HANNs in the overparametrized regime achieves the minimax rate for classification with Lipschitz posterior class probability.

Analysis of convolutional neural network image classifiers in a rotationally symmetric model

TLDR
Under suitable structural and smoothness assumptions on the functional a posteriori probability, it is shown that least squares plug-in classifiers based on convolutional neural networks are able to circumvent the curse of dimensionality in binary image classification if a resolution-dependent error term is neglected.

Research on improved convolutional wavelet neural network

TLDR
A wavelet neural network (WNN) is implemented, which can solve the problems of BPNN and RBFNN and achieve better performance. The proposed wavelet-based convolutional neural network (WCNN) can reduce the mean square error and the error rate of CNN, which means WCNN has better maximum precision than CWNN.

Music Genre Classification Based on Deep Learning

TLDR
Experimental results show that the proposed method can effectively improve the accuracy of music classification and is helpful for music genre classification.

On the rate of convergence of a classifier based on a Transformer encoder

TLDR
It is shown that this Transformer classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model.

Neural network-aided sparse convex optimization algorithm for fast DOA estimation

TLDR
A fast sparse convex optimization algorithm based on a neural network is proposed to improve direction-of-arrival (DOA) estimation, and its advantages in accuracy, computation speed and robustness are verified by simulations.

A Deep Learning Method for Monitoring Vehicle Energy Consumption with GPS Data

TLDR
The proposed monitoring method requires only a GPS sensor, which is cheap and common, and the calculation procedure is simple enough to monitor the energy consumption of various vehicles in real time with ease; however, it does not consider weight, weather and auxiliary changes.

On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent

TLDR
It is shown that, in case of a suitable random initialization of the network, a suitably small stepsize of the gradient descent, and a number of gradient descent steps slightly larger than the reciprocal of the stepsize, the estimate is universally consistent in the sense that its expected L2 error converges to zero for all distributions of the data where the response variable is square integrable.
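
The protocol summarized above (random initialization, a small stepsize, and roughly the reciprocal of the stepsize many gradient descent steps) can be sketched in a few lines of Python. The sketch below is purely illustrative: it uses a one-hidden-layer sigmoid network, trains only the outer weights for brevity, and the width, initialization and stepsize are placeholder choices, not the construction or constants from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: Y = m(X) + noise.
n, d = 200, 3
X = rng.uniform(-1.0, 1.0, size=(n, d))
Y = np.sin(np.pi * X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.standard_normal(n)

# Over-parametrized one-hidden-layer sigmoid network: far more units than samples.
K = 2000
W = rng.standard_normal((d, K))   # random inner weights (kept fixed here for brevity)
b = rng.standard_normal(K)        # random biases (kept fixed here for brevity)
c = np.zeros(K)                   # outer weights, trained by gradient descent

H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # hidden-layer outputs, shape (n, K)

step = 1e-3                  # small stepsize
steps = int(1.2 / step)      # slightly more than 1/stepsize gradient descent steps

for _ in range(steps):
    resid = H @ c - Y                        # residuals of the current estimate
    c -= step * (2.0 / n) * (H.T @ resid)    # gradient step on the empirical L2 risk

print("empirical L2 risk after", steps, "steps:", float(np.mean((H @ c - Y) ** 2)))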

References

Showing 1-10 of 32 references

On the rate of convergence of fully connected very deep neural network regression estimates

TLDR
This paper shows that similar results can also be obtained for least squares estimates based on simple fully connected neural networks with ReLU activation function, based on new approximation results concerning deep neural networks.

Nonparametric regression using deep neural networks with ReLU activation function

TLDR
The theory suggests that for nonparametric regression, scaling the network depth with the sample size is natural and the analysis gives some insights into why multilayer feedforward neural networks perform well in practice.

Estimation of a Function of Low Local Dimensionality by Deep Neural Networks

TLDR
It is shown that the least squares regression estimates using DNNs are able to achieve dimensionality reduction in case that the regression function has locally low dimensionality.

Convergence rates for single hidden layer feedforward networks

Universal approximation bounds for superpositions of a sigmoidal function

  • A. Barron
  • Computer Science
    IEEE Trans. Inf. Theory
  • 1993
TLDR
The approximation rate and the parsimony of the parameterization of the networks are shown to be advantageous in high-dimensional settings, and the integrated squared approximation error cannot be made smaller than order 1/n^{2/d} uniformly for functions satisfying the same smoothness assumption.
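
Stated schematically (constants and exact conditions as in the paper): if f has a Fourier representation whose first moment C_f = \int |\omega| \, |\hat f(\omega)| \, d\omega is finite, then for every n there is a single-hidden-layer sigmoidal network f_n with n units satisfying

\int_{B_r} (f(x) - f_n(x))^2 \, \mu(dx) \le \frac{(2 r C_f)^2}{n},

whereas for linear approximation with n fixed basis functions the integrated squared error cannot be made smaller than order 1/n^{2/d} uniformly over the same function class.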

Approximation and estimation bounds for artificial neural networks

  • A. Barron
  • Computer Science
    Machine Learning
  • 1994
TLDR
The analysis involves Fourier techniques for the approximation error, metric entropy considerations for the estimation error, and a calculation of the index of resolvability of minimum complexity estimation of the family of networks.

On deep learning as a remedy for the curse of dimensionality in nonparametric regression

TLDR
It is shown that least squares estimates based on multilayer feedforward neural networks are able to circumvent the curse of dimensionality in nonparametric regression.
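
As a schematic point of comparison (not a statement of this particular paper's theorem): for a (p,C)-smooth regression function of d variables, the optimal rate of convergence of the expected L2 error is

n^{-2p/(2p+d)}

(Stone, 1982), which deteriorates rapidly as d grows; the structural assumptions used in this line of work (hierarchical composition or interaction models) replace d by a much smaller effective dimension d^*, giving rates of the schematic form n^{-2p/(2p+d^*)}, with the precise definition of d^* and the admissible function classes differing between the papers cited here.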

The phase diagram of approximation rates for deep neural networks

TLDR
It is proved that using both sine and ReLU activations theoretically leads to very fast, nearly exponential approximation rates, thanks to the emerging capability of the network to implement efficient lookup operations.

Neural Network Learning - Theoretical Foundations

TLDR
The authors explain the role of scale-sensitive versions of the Vapnik-Chervonenkis dimension in large margin classification and in real prediction, and discuss the computational complexity of neural network learning.