• Corpus ID: 227013220

Neural network approximation and estimation of classifiers with classification boundary in a Barron class

@article{CarageaPetersenVoigtlaender,
  title={Neural network approximation and estimation of classifiers with classification boundary in a Barron class},
  author={Andrei Caragea and Philipp Christian Petersen and Felix Voigtlaender},
  journal={arXiv: Functional Analysis},
}
We prove bounds for the approximation and estimation of certain classification functions using ReLU neural networks. Our estimation bounds provide a priori performance guarantees for empirical risk minimization using networks of a suitable size, depending on the number of training samples available. The obtained approximation and estimation rates are independent of the dimension of the input, showing that the curse of dimension can be overcome in this setting; in fact, the input dimension only… 
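The setting above, empirical risk minimization over a ReLU network for a classification problem, can be illustrated with a minimal sketch. This is not the authors' construction: the toy boundary, network width, and training loop below are all illustrative assumptions, using a logistic surrogate for the classification risk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy classification problem: label determined by a smooth decision boundary.
d = 2
X = rng.uniform(-1, 1, size=(200, d))
y = (X[:, 0] ** 2 + 0.5 * X[:, 1] > 0).astype(float)

# One-hidden-layer ReLU network (width chosen arbitrarily here; the paper
# ties the network size to the number of training samples).
width = 32
W1 = rng.normal(0.0, 1.0, size=(d, width))
b1 = np.zeros(width)
w2 = rng.normal(0.0, 0.1, size=width)

def empirical_risk(X, y):
    h = np.maximum(X @ W1 + b1, 0.0)          # ReLU hidden layer
    p = 1.0 / (1.0 + np.exp(-(h @ w2)))       # sigmoid of the network score
    return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

# Plain gradient descent on the surrogate empirical risk.
lr = 0.1
risk0 = empirical_risk(X, y)
for _ in range(300):
    h = np.maximum(X @ W1 + b1, 0.0)
    p = 1.0 / (1.0 + np.exp(-(h @ w2)))
    g = (p - y) / len(y)                      # gradient of loss w.r.t. scores
    grad_w2 = h.T @ g
    grad_h = np.outer(g, w2) * (h > 0)        # backprop through the ReLU
    W1 -= lr * (X.T @ grad_h)
    b1 -= lr * grad_h.sum(axis=0)
    w2 -= lr * grad_w2
risk1 = empirical_risk(X, y)
```

The point of the sketch is only the pipeline (sample, fix a network class, minimize empirical risk); the paper's guarantees concern what risk such a minimizer achieves as a function of the sample count.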


Optimal learning of high-dimensional classification problems using deep neural networks

For the class of locally Barron-regular decision boundaries, it is found that the optimal estimation rates are essentially independent of the underlying dimension and can be realized by empirical risk minimization methods over a suitable class of deep neural networks.

Sobolev-type embeddings for neural network approximation spaces

It is found that, analogous to the case of classical function spaces, it is possible to trade “smoothness” (i.e., approximation rate) for increased integrability in neural network approximation spaces, and an optimal “learning” algorithm for reconstructing functions that are well approximable by ReLU neural networks is simply given by piecewise constant interpolation on a tensor product grid.
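The piecewise constant interpolation mentioned above can be sketched directly. This is a generic illustration, not the paper's algorithm: the function, grid size, and nearest-cell reconstruction below are assumptions.

```python
import numpy as np

def pc_interpolant(f, n_per_dim, d):
    """Sample f at the centers of a tensor product grid on [0, 1]^d and
    reconstruct by the value of the cell containing the query point."""
    axes = [(np.arange(n_per_dim) + 0.5) / n_per_dim] * d
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    values = f(grid.reshape(-1, d)).reshape([n_per_dim] * d)

    def g(x):
        idx = np.clip((x * n_per_dim).astype(int), 0, n_per_dim - 1)
        return values[tuple(idx.T)]           # look up the enclosing cell
    return g

# A smooth test function on [0, 1]^2 (hypothetical example).
f = lambda X: np.sin(2 * np.pi * X[:, 0]) + X[:, 1]
g = pc_interpolant(f, n_per_dim=64, d=2)

rng = np.random.default_rng(1)
Xq = rng.uniform(0, 1, size=(1000, 2))
err = np.max(np.abs(g(Xq) - f(Xq)))           # sup error shrinks with the grid
```

For a Lipschitz target, the sup error of this reconstruction is bounded by the Lipschitz constant times half the cell width, which is the sense in which point samples on a grid suffice.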

Proof of the Theory-to-Practice Gap in Deep Learning via Sampling Complexity Bounds for Neural Network Approximation Spaces

This work proves hardness results for the problems of approximation and integration on a novel class of neural network approximation spaces, confirming a conjectured and empirically observed theory-to-practice gap in deep learning, while showing that approximation rates of a comparable order of convergence are (at least theoretically) achievable.

Optimal bump functions for shallow ReLU networks: Weight decay, depth separation and the curse of dimensionality

It is proved that a unique radially symmetric minimizer exists, whose weight decay regularizer and Lipschitz constant grow as d and √d respectively, and it is shown that the weight decay regularizer grows exponentially in d if the label 1 is imposed on a ball of radius ε rather than just at the origin.

Random feature neural networks learn Black-Scholes type PDEs without curse of dimensionality

This article investigates the use of random feature neural networks for learning Kolmogorov partial (integro-)differential equations associated to Black-Scholes and more general exponential Lévy models, and derives bounds for the prediction error of random neural networks for learning sufficiently non-degenerate Black-Scholes type models.
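A random feature network in the sense used above samples the hidden weights once, freezes them, and trains only the linear readout. The sketch below is a generic regression version with an arbitrary smooth target standing in for a PDE solution; the target, feature count, and least-squares readout are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n_features, n_samples = 5, 200, 500
W = rng.normal(size=(d, n_features))         # frozen random hidden weights
b = rng.uniform(-1.0, 1.0, n_features)       # frozen random biases

def features(X):
    return np.maximum(X @ W + b, 0.0)        # random ReLU features

# Hypothetical smooth target (a stand-in for a Kolmogorov PDE solution).
f = lambda X: np.exp(-np.sum(X ** 2, axis=1))

X_train = rng.uniform(-1, 1, size=(n_samples, d))
y_train = f(X_train)

# Only the linear readout is trained, by least squares.
Phi = features(X_train)
coef, *_ = np.linalg.lstsq(Phi, y_train, rcond=None)

X_test = rng.uniform(-1, 1, size=(200, d))
mse = np.mean((features(X_test) @ coef - f(X_test)) ** 2)
```

Because only the readout is optimized, training reduces to a convex least-squares problem, which is what makes dimension-free error bounds for this architecture tractable to prove.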

$L^p$ sampling numbers for the Fourier-analytic Barron space

In this paper, we consider Barron functions $f : [0,1]^d \to \mathbb{R}$ of smoothness $\sigma > 0$, which are functions that can be written as $f(x) = \int_{\mathbb{R}^d} F(\xi) \, e^{2\pi i \langle x, \xi \rangle} \, d\xi$ with $\int_{\mathbb{R}^d} |F(\xi)| \cdot (1 + |\xi|)^{\sigma} \, d\xi < \infty$.

Stationary Density Estimation of Itô Diffusions Using Deep Learning

This theoretical study relies on a recent perturbation result for Markov chains showing that the density estimation error depends linearly on the error in estimating the drift term, together with generalization error results for nonparametric regression and for PDE regression solutions obtained with neural-network models.

Deep neural networks for solving extremely large linear systems

Several numerical examples arising from partial differential equations, queueing problems and probabilistic Boolean networks are presented to demonstrate that solutions of linear systems with sizes ranging from septillion to nonillion can be learned quite accurately.

Approximation and Estimation for High-Dimensional Deep Learning Networks

The heart of the analysis is the development of a sampling strategy that demonstrates the accuracy of a sparse covering of deep ramp networks, and lower bounds show that the identified risk is close to being optimal.

A Priori Estimates of the Generalization Error for Two-layer Neural Networks

New estimates of the population risk are established for the two-layer neural network model. These estimates are a priori in nature, in the sense that the bounds depend only on some norms of the underlying function to be fitted and not on the parameters of the model, in contrast with most existing results, which are a posteriori in nature.

Advantage of Deep Neural Networks for Estimating Functions with Singularity on Curves

The generalization error of a DNN estimator is derived and its convergence rate is proved to be almost optimal, while a certain class of common methods, including linear estimators and harmonic analysis approaches such as wavelets and curvelets, is shown to be sub-optimal.

Adaptive Approximation and Generalization of Deep Neural Network with Intrinsic Dimensionality

This study derives bounds for an approximation error and a generalization error regarding DNNs with intrinsically low dimensional covariates and proves that an intrinsic low dimensionality of covariates is the main factor that determines the performance of deep neural networks.

Optimal Approximation Rates and Metric Entropy of ReLU$^k$ and Cosine Networks

This article establishes the precise asymptotics of the $L^2$-metric entropy of the unit ball of these gauge spaces and, as a consequence, the optimal approximation rates for shallow ReLU networks.

Deep Neural Network Approximation Theory

Deep networks provide exponential approximation accuracy—i.e., the approximation error decays exponentially in the number of nonzero weights in the network— of the multiplication operation, polynomials, sinusoidal functions, and certain smooth functions.
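The exponential accuracy for multiplication can be made concrete via a classical construction (Yarotsky's, stated here as a generic illustration rather than the cited paper's own proof): $x^2$ on $[0,1]$ equals $x$ minus a telescoping sum of composed ReLU "hat" functions, with uniform error decaying like $4^{-m}$ in the depth $m$.

```python
import numpy as np

def hat(x):
    # Tent function on [0, 1] built from three ReLUs.
    relu = lambda t: np.maximum(t, 0.0)
    return 2 * relu(x) - 4 * relu(x - 0.5) + 2 * relu(x - 1.0)

def square_approx(x, m):
    # x^2 = x - sum_{k>=1} g_k(x) / 4^k on [0, 1], where g_k is the
    # k-fold composition of the hat function; truncate at depth m.
    g = np.asarray(x, dtype=float)
    out = g.copy()
    for k in range(1, m + 1):
        g = hat(g)
        out -= g / 4 ** k
    return out

xs = np.linspace(0.0, 1.0, 1001)
err = np.max(np.abs(square_approx(xs, m=10) - xs ** 2))   # ~4^{-(m+1)}
```

Multiplication then follows from the polarization identity $xy = \frac{1}{2}\big((x+y)^2 - x^2 - y^2\big)$, each square realized by such a depth-$m$ subnetwork, which is the mechanism behind the exponential rates quoted above.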

Universal approximation bounds for superpositions of a sigmoidal function

  • A. Barron
  • Computer Science
    IEEE Trans. Inf. Theory
  • 1993
The approximation rate and the parsimony of the parameterization of the networks are shown to be advantageous in high-dimensional settings, while the integrated squared approximation error of linear combinations of fixed basis functions cannot be made smaller than order $1/n^{2/d}$ uniformly over functions satisfying the same smoothness assumption.
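The two rates in this summary can be written schematically (constants and precise function classes simplified; $C_f$ denotes Barron's spectral norm of $f$, $f_n$ an approximant with $n$ terms):

```latex
% Dimension-free upper bound for an n-unit sigmoidal network (Barron, 1993):
\| f - f_n \|_{L^2(\mu)}^2 \;\le\; \frac{(2 C_f)^2}{n},
% versus the lower bound for any fixed n-dimensional linear space V:
\inf_{\dim V = n} \;\sup_{C_f \le C}\; \operatorname{dist}_{L^2}(f, V)^2 \;\gtrsim\; n^{-2/d}.
```

The contrast between the dimension-free rate $1/n$ and the cursed rate $n^{-2/d}$ is exactly the separation between nonlinear network approximation and fixed-basis methods that the later papers in this list build on.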

Approximation and estimation bounds for artificial neural networks

  • A. Barron
  • Computer Science
    Machine Learning
  • 1994
The analysis involves Fourier techniques for the approximation error, metric entropy considerations for the estimation error, and a calculation of the index of resolvability of minimum complexity estimation of the family of networks.

Approximation and Estimation Bounds for Artificial Neural Networks

The mean integrated squared error between the estimated network and a target function $f$ is shown to be bounded by $O(C_f ((d/N)\log N)^{1/2})$, which demonstrates surprisingly favorable properties of network estimation compared to traditional series and nonparametric curve estimation techniques in the case that $d$ is moderately large.