We prove that neural networks with a single hidden layer can achieve an optimal order of approximation for functions assumed to possess a given number of derivatives, provided the activation function evaluated by each principal element satisfies certain technical conditions.
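A quick numerical illustration of the theme (our sketch, not the paper's construction): least-squares fits over nested random tanh features of growing width, applied to a target of limited smoothness, show the approximation error shrinking as units are added. The target function, feature distribution, and widths below are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 400)
f = np.abs(x - 0.5) ** 1.5   # only finitely smooth at x = 0.5

# Nested random tanh features sigma(lam * (x - t)); widths 5, 20, 80.
lam = rng.uniform(-20.0, 20.0, 80)
t = rng.uniform(0.0, 1.0, 80)

errs = {}
for n in (5, 20, 80):
    Phi = np.tanh(lam[:n] * (x[:, None] - t[:n]))   # (400, n) feature matrix
    c, *_ = np.linalg.lstsq(Phi, f, rcond=None)     # least-squares coefficients
    errs[n] = np.linalg.norm(Phi @ c - f) / np.sqrt(len(x))
    print(f"width {n:3d}: RMS error {errs[n]:.3e}")
```

Because the feature sets are nested, the least-squares residual is guaranteed not to increase with width; the theorems above quantify how fast it can decrease for a given smoothness class.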

The paper reviews and extends an emerging body of theoretical results on deep learning, including the conditions under which it can be exponentially better than shallow learning.

We prove that feedforward artificial neural networks with a single hidden layer and an ideal sigmoidal response function cannot provide localized approximation in a Euclidean space of dimension…

We prove that an artificial neural network with multiple hidden layers and a kth-order sigmoidal response function can be used to approximate any continuous function on any compact subset of a Euclidean space so as to achieve the Jackson rate of approximation.

Let σ: R → R be such that for some polynomial P, σ/P is bounded. We consider the linear span of the functions {σ(λ · (x − t)) : λ, t ∈ R^s}. We prove that unless σ is itself a polynomial, it is…
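As an illustration of this density phenomenon (our sketch under illustrative choices, not the paper's proof technique): a least-squares fit in the span of randomly dilated and shifted tanh units, with tanh standing in for a non-polynomial σ, reproduces a continuous target on [0, 1] to small uniform error. The target, the number of units, and the parameter ranges are all assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: a continuous function on [0, 1].
x = np.linspace(0.0, 1.0, 200)
f = np.sin(2 * np.pi * x)

# Random dilations and shifts; sigma = tanh is not a polynomial.
n = 50
lam = rng.uniform(-10.0, 10.0, n)
t = rng.uniform(0.0, 1.0, n)
Phi = np.tanh(lam * (x[:, None] - t))   # (200, n) matrix of sigma(lam*(x - t))

# Least-squares fit in the linear span of these functions.
coef, *_ = np.linalg.lstsq(Phi, f, rcond=None)
err = np.max(np.abs(Phi @ coef - f))
print(f"sup-norm error with {n} units: {err:.3e}")
```

If σ were a polynomial, the span would consist of polynomials of bounded degree, and no amount of units could approximate an arbitrary continuous function; the non-polynomiality hypothesis is exactly what rules this out.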

For the extremal problem E_{n,r}(α) := min ‖exp(−|x|^α)(x^n + ···)‖_{L_r}, α > 0, where L_r (0 < r < ∞) denotes the usual integral norm over R, and the minimum is taken over all monic polynomials of degree n, we…

We prove that deep (hierarchical) networks can approximate the class of compositional functions with the same accuracy as shallow networks, but with an exponentially smaller number of training parameters, as well as lower VC-dimension.

We obtain quadrature formulas that are exact for spherical harmonics of a fixed order, have nonnegative weights, and are based on function values at scattered sites.
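A one-dimensional analogue of such exactness (our illustration on the circle S^1, not the spherical construction, and with equispaced rather than scattered sites): the N-point equispaced rule with equal nonnegative weights 2π/N integrates every trigonometric harmonic of degree below N exactly.

```python
import numpy as np

# N-point equispaced rule on [0, 2*pi) with equal nonnegative weights.
N = 8
theta = 2 * np.pi * np.arange(N) / N
w = np.full(N, 2 * np.pi / N)

# The rule reproduces the integral of e^{i k theta} (which is 0 for
# 0 < k < N) exactly -- the circle counterpart of spherical-harmonic
# exactness of a fixed order.
for k in range(1, N):
    q = np.sum(w * np.exp(1j * k * theta))
    assert abs(q) < 1e-12
print(f"rule is exact for all harmonics of degree < {N}")
```

The spherical result is harder precisely because the sites are scattered: nonnegative weights achieving exactness must then be constructed rather than read off from symmetry.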

We construct a multiscale tight frame based on an arbitrary orthonormal basis for the L2 space of an arbitrary sigma-finite measure space. The approximation properties of the resulting multiscale are…