Corpus ID: 160009610

Universal Approximation with Deep Narrow Networks

@article{Kidger2019UniversalAW,
  title={Universal Approximation with Deep Narrow Networks},
  author={Patrick Kidger and Terry Lyons},
  journal={ArXiv},
  year={2019},
  volume={abs/1905.08539}
}
The classical Universal Approximation Theorem holds for neural networks of arbitrary width and bounded depth. Here we consider the natural `dual' scenario for networks of bounded width and arbitrary depth. Precisely, let $n$ be the number of input neurons, $m$ be the number of output neurons, and let $\rho$ be any nonaffine continuous function, with a continuous nonzero derivative at some point. Then we show that the class of neural networks of arbitrary depth, width $n + m + 2$, and…
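The width bound above is concrete enough to illustrate. The following is a minimal sketch, assuming PyTorch and a hypothetical helper name `deep_narrow_mlp`, of the architecture class the theorem refers to: hidden layers of fixed width $n + m + 2$, arbitrary depth, and a nonaffine activation with a continuous nonzero derivative somewhere (tanh qualifies). It is not the authors' explicit construction; the theorem is existential, asserting that this class is dense in the continuous functions on compacta, not prescribing how to find the weights.

```python
# Illustrative sketch only (assumes PyTorch); NOT the authors' construction,
# just a member of the class the theorem covers: fixed hidden width n + m + 2,
# arbitrary depth, nonaffine activation.
import torch
import torch.nn as nn


def deep_narrow_mlp(n: int, m: int, depth: int, activation=nn.Tanh) -> nn.Sequential:
    """Feed-forward net with `depth` hidden layers, each of width n + m + 2.

    tanh is nonaffine with a continuous nonzero derivative everywhere,
    so it satisfies the hypothesis on the activation rho.
    """
    width = n + m + 2
    layers = [nn.Linear(n, width), activation()]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), activation()]
    layers.append(nn.Linear(width, m))  # final affine read-out to R^m
    return nn.Sequential(*layers)


# Example: maps from R^3 to R^2 use hidden width 3 + 2 + 2 = 7, at any depth.
net = deep_narrow_mlp(n=3, m=2, depth=10)
x = torch.randn(16, 3)   # batch of 16 inputs in R^3
print(net(x).shape)      # torch.Size([16, 2])
```

The theorem guarantees that, as depth grows, networks of this shape can approximate any continuous function on a compact set to arbitrary accuracy; actually training such a network to do so is a separate matter.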


Minimum Width for Universal Approximation
TLDR
This work provides the first definitive result in this direction for networks using the ReLU activation function: the minimum width required for universal approximation of $L^p$ functions is exactly $\max\{d_x + 1, d_y\}$.
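As a concrete reading of that formula (illustrative numbers, not taken from the cited paper): for input dimension $d_x = 3$ and output dimension $d_y = 2$, the stated minimum width is $\max\{3 + 1, 2\} = 4$, whereas the Kidger-Lyons bound above gives width $3 + 2 + 2 = 7$.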
Arbitrary-Depth Universal Approximation Theorems for Operator Neural Networks
TLDR
It is proved that for non-polynomial activation functions that are continuously differentiable at a point with a nonzero derivative, one can construct an operator NN of width five, whose inputs are real numbers with finite decimal representations, that is arbitrarily close to any given continuous nonlinear operator.
Quantitative Rates and Fundamental Obstructions to Non-Euclidean Universal Approximation with Deep Narrow Feed-Forward Networks
TLDR
The number of narrow layers required for these “deep geometric feed-forward neural networks” (DGNs) to approximate any continuous function in $C(X, Y)$ uniformly on compacts is quantified, and a quantitative version of the universal approximation theorem is obtained.
Abstract Universal Approximation for Neural Networks
TLDR
The AUA theorem shows that there exists a neural network that approximates $f$ and for which proofs of robustness can be constructed automatically using the interval abstract domain, shedding light on the existence of provably correct neural networks.
Width is Less Important than Depth in ReLU Neural Networks
TLDR
It is shown that depth plays a more significant role than width in the expressive power of neural networks, via an exact representation of wide, shallow networks by deep, narrow networks which, in certain cases, does not increase the number of parameters over the target network.
Universal Approximation Under Constraints is Possible with Transformers
TLDR
A quantitative constrained universal approximation theorem is proved, guaranteeing that for any convex or non-convex compact set $K$ and any continuous function $f : \mathbb{R}^n \to K$, there is a probabilistic transformer $\hat{F}$ whose randomized outputs all lie in $K$ and whose expected output uniformly approximates $f$.
Universal approximation power of deep residual neural networks via nonlinear control theory
TLDR
The universal approximation capabilities of deep residual neural networks through geometric nonlinear control are explained and monotonicity is identified as the bridge between controllability of finite ensembles and uniform approximability on compact sets.
Piecewise-Linear Activations or Analytic Activation Functions: Which Produce More Expressive Neural Networks?
TLDR
The main result demonstrates that deep networks with piecewise-linear activations (e.g. ReLU or PReLU) are fundamentally more expressive than deep feedforward networks with analytic activation functions; this is further explained by quantitatively demonstrating a “separation phenomenon” between these two classes of networks.
Universal Approximation Power of Deep Neural Networks via Nonlinear Control Theory
TLDR
This paper provides a general sufficient condition for a residual network to have the power of universal approximation by requiring the activation function, or one of its derivatives, to satisfy a quadratic differential equation.
Characterizing the Universal Approximation Property
TLDR
This paper constructs a modification of the feed-forward architecture which can approximate any continuous function, with a controlled growth rate, uniformly on the entire domain space, and shows that the standard feed-forward architecture typically cannot.

References

SHOWING 1-10 OF 35 REFERENCES
Approximating Continuous Functions by ReLU Nets of Minimal Width
This article concerns the expressive power of depth in deep feed-forward neural nets with ReLU activations. Specifically, we answer the following question: for a fixed $d \geq 1$, what is the minimal…
The Expressive Power of Neural Networks: A View from the Width
TLDR
It is shown that there exist classes of wide networks which cannot be realized by any narrow network whose depth is no more than a polynomial bound, and that narrow networks whose size exceeds the polynomial bound by a constant factor can approximate wide and shallow networks with high accuracy.
Understanding Deep Neural Networks with Rectified Linear Units
TLDR
The gap theorems hold for smoothly parametrized families of "hard" functions, in contrast to the countable, discrete families known in the literature, and a new lower bound on the number of affine pieces is shown, larger than previous constructions in certain regimes of the network architecture.
Optimal approximation of piecewise smooth functions using deep ReLU neural networks
Error bounds for approximations with deep ReLU neural networks in $W^{s, p}$ norms
TLDR
This work constructs, based on a calculus of ReLU networks, artificial neural networks with ReLU activation functions that achieve certain approximation rates and establishes lower bounds for the approximation by ReLU neural networks for classes of Sobolev-regular functions.
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
TLDR
By exploiting depth, it is shown that 3-layer ReLU networks with $\Omega(\sqrt{N})$ hidden nodes can perfectly memorize most datasets with $N$ points, and it is proved that width $\Theta(\sqrt{N})$ is necessary and sufficient for memorizing $N$ data points, giving tight bounds on memorization capacity.
Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds
TLDR
This paper proves that neural networks can efficiently approximate functions supported on low dimensional manifolds, with an exponent depending on the intrinsic dimension of the data and the smoothness of the function.
Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units
TLDR
This analysis covers discrete restricted Boltzmann machines and naive Bayes models as special cases, and shows that a $q$-ary deep belief network with sufficiently many layers of bounded width can approximate any probability distribution on its state space without exceeding a prescribed Kullback-Leibler divergence.
Nonlinear Approximation and (Deep) ReLU Networks
TLDR
The main results of this article prove that neural networks possess even greater approximation power than traditional methods of nonlinear approximation, exhibiting large classes of functions which can be efficiently captured by neural networks but where classical nonlinear methods fall short of the task.
How degenerate is the parametrization of neural networks with the ReLU activation function?
TLDR
The pathologies which prevent inverse stability in general are presented, and it is shown that by optimizing over such restricted sets, it is still possible to learn any function which can be learned by optimization over unrestricted sets.