# Nonparametric regression using deep neural networks with ReLU activation function

@article{SchmidtHieber2020NonparametricRU, title={Nonparametric regression using deep neural networks with ReLU activation function}, author={Johannes Schmidt-Hieber}, journal={ArXiv}, year={2020}, volume={abs/1708.06633} }

Consider the multivariate nonparametric regression model. It is shown that estimators based on sparsely connected deep neural networks with ReLU activation function and properly chosen network architecture achieve the minimax rates of convergence (up to $\log n$-factors) under a general composition assumption on the regression function. The framework includes many well-studied structural constraints such as (generalized) additive models. While there is a lot of flexibility in the network…
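The two ingredients of the abstract — a ReLU network and a sparsity constraint on its connections — can be illustrated with a minimal sketch. All sizes and names below (depth, width, the sparsity level `s`) are illustrative choices, not values from the paper: the weights of a small fully connected ReLU network are thresholded so that only the `s` largest-magnitude entries remain nonzero, mimicking a sparsely connected architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def sparse_relu_network(x, weights, biases):
    """Forward pass of a feedforward ReLU network; weight matrices may be mostly zero."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return h @ weights[-1] + biases[-1]  # linear output layer

# Illustrative architecture: input dimension 4, three hidden layers of width 16.
dims = [4, 16, 16, 16, 1]
weights = [rng.normal(size=(m, n)) for m, n in zip(dims[:-1], dims[1:])]
biases = [rng.normal(size=n) for n in dims[1:]]

# Sparsity constraint: keep only the s largest-magnitude weights across all layers.
s = 100
all_w = np.concatenate([W.ravel() for W in weights])
threshold = np.sort(np.abs(all_w))[-s]
weights = [np.where(np.abs(W) >= threshold, W, 0.0) for W in weights]

x = rng.normal(size=(8, 4))  # batch of 8 inputs
y = sparse_relu_network(x, weights, biases)
nonzero = sum(int(np.count_nonzero(W)) for W in weights)
print(y.shape, nonzero)  # output has shape (8, 1); at most s weights are nonzero
```

This only sketches the function class; the paper's statistical result concerns least squares estimators over such sparse networks, whose convergence rate is driven by the smoothness and intrinsic dimensions of the composed functions rather than the ambient input dimension.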

## Figures from this paper

## 347 Citations

Robust Nonparametric Regression with Deep Neural Networks

- Computer Science, Mathematics
- 2021

Simulation studies demonstrate that the robust methods can significantly outperform the least squares method when the errors have heavy-tailed distributions and illustrate that the choice of loss function is important in the context of deep nonparametric regression.

On the rate of convergence of fully connected very deep neural network regression estimates

- Computer Science, The Annals of Statistics
- 2021

This paper shows that it is possible to get similar results also for least squares estimates based on simple fully connected neural networks with ReLU activation functions, based on new approximation results concerning deep neural networks.

A comparison of deep networks with ReLU activation function and linear spline-type methods

- Computer Science, Neural Networks
- 2019

How do noise tails impact on deep ReLU networks?

- Computer Science
- 2022

This work unveils how the optimal rate of convergence depends on $p$, the degree of smoothness, and the intrinsic dimension in a class of nonparametric regression functions with hierarchical composition structure when both the adaptive Huber loss and deep ReLU neural networks are used.

Measurement error models: from nonparametric methods to deep neural networks

- Computer Science, ArXiv
- 2020

This paper proposes an efficient neural network design for estimating measurement error models, which utilizes recent advances in variational inference for deep neural networks, such as the importance weight autoencoder, doubly reparametrized gradient estimator, and non-linear independent components estimation.

Analysis of the rate of convergence of fully connected deep neural network regression estimates with smooth activation function

- Computer Science, J. Multivar. Anal.
- 2021

Statistical Learning using Sparse Deep Neural Networks in Empirical Risk Minimization

- Computer Science
- 2021

It is derived that the SDRN estimator can achieve the same minimax rate of estimation as one-dimensional nonparametric regression when the dimension of the features is fixed, and that the estimator has a suboptimal rate when the dimension grows with the sample size.

Analysis of the rate of convergence of neural network regression estimates which are easy to implement.

- Computer Science, Mathematics
- 2019

This article introduces a new neural network regression estimate in which, motivated by some recent approximation results for neural networks, most of the weights are chosen independently of the data; the estimate is therefore easy to implement, and it achieves the one-dimensional rate of convergence.

## References

Showing 1-10 of 91 references.

A comparison of deep networks with ReLU activation function and linear spline-type methods

- Computer Science, Neural Networks
- 2019

Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science

- Computer Science, Nature Communications
- 2018

A method to design neural networks as sparse scale-free networks, which leads to a reduction in the computational time required for training and inference, and which has the potential to enable artificial neural networks to scale up beyond what is currently possible.

Approximation and Estimation for High-Dimensional Deep Learning Networks

- Computer Science, ArXiv
- 2018

The heart of the analysis is the development of a sampling strategy that demonstrates the accuracy of a sparse covering of deep ramp networks, and lower bounds show that the identified risk is close to being optimal.

Breaking the Curse of Dimensionality with Convex Neural Networks

- Computer Science, J. Mach. Learn. Res.
- 2017

This work considers neural networks with a single hidden layer and non-decreasing homogeneous activation functions like the rectified linear units and shows that they are adaptive to unknown underlying linear structures, such as the dependence on the projection of the input variables onto a low-dimensional subspace.

Optimal approximation of continuous functions by very deep ReLU networks

- Computer Science, COLT
- 2018

It is proved that constant-width fully-connected networks of depth $L\sim W$ provide the fastest possible approximation rate $\|f-\widetilde f\|_\infty = O(\omega_f(O(W^{-2/\nu})))$ that cannot be achieved with less deep networks.

Why Deep Neural Networks for Function Approximation?

- Computer Science, ICLR
- 2017

It is shown that, for a large class of piecewise smooth functions, the number of neurons needed by a shallow network to approximate a function is exponentially larger than the corresponding number of neurons needed by a deep network for a given degree of function approximation.

On the Number of Linear Regions of Deep Neural Networks

- Computer Science, NIPS
- 2014

We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep…

Adaptive Approximation and Estimation of Deep Neural Network to Intrinsic Dimensionality

- Computer Science, ArXiv
- 2019

It is theoretically proved that the generalization performance of deep neural networks (DNNs) is mainly determined by an intrinsic low-dimensional structure of data, and DNNs outperform other non-parametric estimators which are also adaptive to the intrinsic dimension.

Deep ReLU network approximation of functions on a manifold

- Computer Science, Mathematics, ArXiv
- 2019

This work studies a regression problem with inputs on a $d^*$-dimensional manifold that is embedded into a space with potentially much larger ambient dimension, and derives statistical convergence rates for the estimator minimizing the empirical risk over all possible choices of bounded network parameters.

Convergence rates for single hidden layer feedforward networks

- Computer Science, Mathematics, Neural Networks
- 1994