# Approximation Spaces of Deep Neural Networks

@article{Gribonval2019ApproximationSO, title={Approximation Spaces of Deep Neural Networks}, author={R{\'e}mi Gribonval and Gitta Kutyniok and Morten Nielsen and Felix Voigtl{\"a}nder}, journal={Constructive Approximation}, year={2019}, volume={55}, pages={259-367} }

We study the expressivity of deep neural networks. Measuring a network’s complexity by its number of connections or by its number of neurons, we consider the class of functions for which the error of best approximation with networks of a given complexity decays at a certain rate when increasing the complexity budget. Using results from classical approximation theory, we show that this class can be endowed with a (quasi)-norm that makes it a linear function space, called approximation space. We…

## 69 Citations

### Sobolev-type embeddings for neural network approximation spaces

- Mathematics, Computer Science, ArXiv
- 2021

It is found that, analogous to the case of classical function spaces, it is possible to trade "smoothness" (i.e., approximation rate) for increased integrability in neural network approximation spaces. Moreover, an optimal "learning" algorithm for reconstructing functions that are well approximable by ReLU neural networks is simply given by piecewise constant interpolation on a tensor product grid.
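The reconstruction scheme mentioned in this blurb can be illustrated with a minimal sketch, not taken from the paper: sample the target function at the centers of a uniform tensor-product grid on $[0,1]^d$ and answer queries with the value stored for the enclosing cell. The names `fit_piecewise_constant`, `predict`, and the parameters `n` (cells per axis) and `d` (dimension) are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def fit_piecewise_constant(f, n, d):
    """Sample f at the centers of a uniform n^d tensor-product grid on [0,1]^d.

    Illustrative sketch of piecewise-constant interpolation on a tensor
    product grid; hypothetical helper, not the paper's algorithm verbatim.
    """
    centers = (np.arange(n) + 0.5) / n               # 1-D cell centers
    grid = np.stack(np.meshgrid(*([centers] * d), indexing="ij"), axis=-1)
    return np.apply_along_axis(f, -1, grid)          # point samples of f

def predict(values, x):
    """Return the stored sample of the grid cell containing each query point."""
    n = values.shape[0]
    idx = np.clip((np.asarray(x) * n).astype(int), 0, n - 1)  # cell indices
    return values[tuple(idx.T)]
```

For a Lipschitz target, the uniform error of this approximant decays like the grid spacing $1/n$, i.e., like the modulus of continuity evaluated at the cell width.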

### ReLU Network Approximation in Terms of Intrinsic Parameters

- Computer Science, Mathematics, ICML
- 2022

This paper shows that the number of parameters that need to be learned can be significantly smaller than people typically expect, and conducts several experiments to verify that training a small subset of the parameters can also achieve good results for classification problems if the other parameters are pre-specified or pre-trained on a related problem.

### Simultaneous neural network approximation for smooth functions

- Computer Science, Neural Networks
- 2022

### Approximation with Tensor Networks. Part III: Multivariate Approximation

- Computer Science, ArXiv
- 2021

Tensor networks exhibit universal expressivity w.r.t. isotropic, anisotropic and mixed smoothness spaces that is comparable with more general neural networks families such as deep rectified linear unit (ReLU) networks.

### Simultaneous Neural Network Approximations in Sobolev Spaces

- Computer Science, ArXiv
- 2021

This work shows that deep ReLU networks of width $O(N \log N)$ and of depth $O(L \log L)$ can achieve a non-asymptotic approximation rate of $O(N^{-2(s-1)/d} L^{-2(s-1)/d})$ with respect to the $\mathcal{W}^{1,p}([0,1]^d)$ norm for $p \in [1,\infty)$.

### Integral representations of shallow neural network with Rectified Power Unit activation function

- Mathematics, Neural Networks
- 2022

### Do ReLU Networks Have An Edge When Approximating Compactly-Supported Functions?

- Computer Science, Mathematics
- 2022

It is shown that polynomial regressors and analytic feedforward networks are not universal in this space, and a quantitative uniform version of the universal approximation theorem is derived on the dense subclass of compactly-supported Lipschitz functions.

### How degenerate is the parametrization of neural networks with the ReLU activation function?

- Computer Science, Mathematics, NeurIPS
- 2019

The pathologies which prevent inverse stability in general are presented, and it is shown that, by optimizing over suitably restricted parameter sets, it is still possible to learn any function which can be learned by optimization over unrestricted sets.

### Computation complexity of deep ReLU neural networks in high-dimensional approximation

- Computer Science, Neural Networks
- 2021

### Linear approximability of two-layer neural networks: A comprehensive analysis based on spectral decay

- Computer Science, ArXiv
- 2021

It is proved that for a family of non-smooth activation functions, including ReLU, approximating any single neuron with random features suffers from the curse of dimensionality, providing an explicit separation of expressiveness between neural networks and random feature models.

## References

Showing 1-10 of 71 references.

### Optimal approximation of piecewise smooth functions using deep ReLU neural networks

- Computer Science, Neural Networks
- 2018

### Optimal Approximation with Sparsely Connected Deep Neural Networks

- Computer Science, SIAM J. Math. Data Sci.
- 2019

All function classes that are optimally approximated by a general class of representation systems, so-called affine systems, can be approximated by deep neural networks with minimal connectivity and memory requirements, and it is proved that the lower bounds are achievable for a broad family of function classes.

### Optimal approximation of continuous functions by very deep ReLU networks

- Computer Science, COLT
- 2018

It is proved that constant-width fully connected networks of depth $L \sim W$ provide the fastest possible approximation rate $\|f-\widetilde f\|_\infty = O(\omega_f(O(W^{-2/\nu})))$, which cannot be achieved with shallower networks.

### On the Expressive Power of Deep Learning: A Tensor Analysis

- Computer Science, COLT 2016
- 2015

It is proved that, apart from a negligible set, all functions that can be implemented by a deep network of polynomial size require exponential size in order to be realized (or even approximated) by a shallow network.

### Deep vs. shallow networks : An approximation theory perspective

- Computer Science, ArXiv
- 2016

A new definition of relative dimension is proposed to encapsulate different notions of sparsity of a function class that can possibly be exploited by deep networks but not by shallow ones to drastically reduce the complexity required for approximation and learning.

### Neural Networks for Optimal Approximation of Smooth and Analytic Functions

- Mathematics, Computer Science, Neural Computation
- 1996

We prove that neural networks with a single hidden layer are capable of providing an optimal order of approximation for functions assumed to possess a given number of derivatives, if the activation…

### Nonparametric regression using deep neural networks with ReLU activation function

- Computer Science, The Annals of Statistics
- 2020

The theory suggests that for nonparametric regression, scaling the network depth with the sample size is natural and the analysis gives some insights into why multilayer feedforward neural networks perform well in practice.

### Nearly-tight VC-dimension and Pseudodimension Bounds for Piecewise Linear Neural Networks

- Computer Science, Mathematics, J. Mach. Learn. Res.
- 2019

New upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function are proved, showing no dependence on depth for piecewise-constant activations, linear dependence for piecewise-linear, and no more than quadratic dependence for general piecewise-polynomial activations.