# Extending Universal Approximation Guarantees: A Theoretical Justification for the Continuity of Real-World Learning Tasks

@article{Durvasula2022ExtendingUA,
  title   = {Extending Universal Approximation Guarantees: A Theoretical Justification for the Continuity of Real-World Learning Tasks},
  author  = {Naveen Durvasula},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2212.07934}
}

Universal Approximation Theorems establish the density of various classes of neural network function approximators in $C(K, \mathbb{R}^m)$, where $K \subset \mathbb{R}^n$ is compact. In this paper, we aim to extend these guarantees by establishing conditions on learning tasks that guarantee their continuity. We consider learning tasks given by conditional expectations $x \mapsto \mathbb{E}[Y \mid X = x]$, where the learning target $Y = f \circ L$ is a potentially pathological transformation of some underlying data-generating…
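To make the learning-task setup concrete, here is a minimal sketch of estimating the conditional expectation $\mathbb{E}[Y \mid X = x]$ by local averaging. The specific maps `L` and `f` and all names are illustrative assumptions, not taken from the paper; the point is only that the conditional expectation can be continuous in $x$ even when the transformation $f$ is not smooth.

```python
import random
import math

random.seed(0)

def L(x):
    # Hypothetical underlying data-generating map (an assumption for this sketch).
    return math.sin(2 * math.pi * x)

def f(z):
    # Hypothetical non-smooth transformation, so Y = f(L(X)) + noise.
    return abs(z)

samples = [(x, f(L(x)) + random.gauss(0, 0.1))
           for x in (random.random() for _ in range(50_000))]

def cond_exp(x0, width=0.02):
    """Empirical estimate of E[Y | X ≈ x0] by averaging Y over a small window."""
    ys = [y for x, y in samples if abs(x - x0) < width]
    return sum(ys) / len(ys)

# Since the noise has mean zero, E[Y | X = x] = f(L(x)); near x = 0.25
# that is |sin(pi/2)| = 1, and the empirical estimate lands close to it.
print(cond_exp(0.25))
```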

## References


### Universal Approximation with Deep Narrow Networks

- Computer Science, Mathematics · COLT
- 2019

Whereas the classical Universal Approximation Theorem holds for neural networks of arbitrary width and bounded depth, this work establishes the dual result for networks of bounded width and arbitrary depth: density holds even for nowhere-differentiable activation functions and on noncompact domains with respect to the $L^p$-norm, and the width may be reduced to just $n + m + 1$ for 'most' activation functions.

### The Expressive Power of Neural Networks: A View from the Width

- Computer Science · NIPS
- 2017

It is shown that there exist classes of wide networks that cannot be realized by any narrow network whose depth is no more than a polynomial bound, and that narrow networks whose size exceeds the polynomial bound by a constant factor can approximate wide and shallow networks with high accuracy.

### Lipschitz continuity of probability kernels in the optimal transport framework

- Mathematics, Computer Science
- 2020

General conditions are given for the Lipschitz continuity of probability kernels with respect to metric structures arising within the optimal transport framework, such as the Wasserstein metric.

### Minimum Width for Universal Approximation

- Computer Science · ICLR
- 2021

This work provides the first definitive result in this direction for networks using the ReLU activation function: the minimum width required for universal approximation of $L^p$ functions is exactly $\max\{d_x+1, d_y\}$.
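The bound cited above is simple enough to state as code. The helper below (the function name and framing are my own, not from the cited paper) just evaluates $\max\{d_x+1, d_y\}$ for given input and output dimensions:

```python
def min_width_relu_Lp(d_x: int, d_y: int) -> int:
    """Exact minimum width for L^p universal approximation with ReLU
    networks, per the cited result: max{d_x + 1, d_y}."""
    return max(d_x + 1, d_y)

# For scalar regression on 3-dimensional inputs, width 4 suffices
# (and no smaller width does):
print(min_width_relu_Lp(3, 1))  # 4
```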

### Approximating Continuous Functions by ReLU Nets of Minimal Width

- Computer Science · ArXiv
- 2017

This article concerns the expressive power of depth in deep feed-forward neural nets with ReLU activations. Specifically, we answer the following question: for a fixed $d\geq 1,$ what is the minimal…

### Bayesian inverse problems for functions and applications to fluid mechanics

- Mathematics
- 2009

In this paper we establish a mathematical framework for a range of inverse problems for functions, given a finite set of noisy observations. The problems are hence underdetermined and are often…

### The Bayesian Approach to Inverse Problems

- Mathematics
- 2017

These lecture notes highlight the mathematical and computational structure relating to the formulation of, and development of algorithms for, the Bayesian approach to inverse problems in…

### Approximation by superpositions of a sigmoidal function

- Computer Science · Math. Control. Signals Syst.
- 1989

In this paper we demonstrate that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of $n$ real…
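A minimal numerical illustration of this density result, under my own toy construction (not Cybenko's proof): a two-term superposition $\sigma(k(x-a)) - \sigma(k(x-b))$ of steep sigmoids approximates the indicator of $[a, b]$, and sums of such bumps approximate continuous functions.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def bump(x, a=0.3, b=0.7, k=200.0):
    # Difference of two affinely shifted sigmoids: approximately 1 on [a, b],
    # approximately 0 outside, with a sharp transition of width ~ 1/k.
    return sigmoid(k * (x - a)) - sigmoid(k * (x - b))

grid = [i / 1000 for i in range(1001)]
# Uniform error against the indicator of [0.3, 0.7], measured away from
# the jump points (where any continuous approximant must transition).
err = max(abs(bump(x) - (1.0 if 0.3 <= x <= 0.7 else 0.0))
          for x in grid
          if abs(x - 0.3) > 0.05 and abs(x - 0.7) > 0.05)
print(err < 1e-3)
```

Increasing the steepness `k` shrinks both the transition width and the error away from the jumps, which is the mechanism behind uniform approximation on compact sets.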

### Visualizing Data using t-SNE

- Computer Science
- 2008

A new technique called t-SNE visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map; it is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

### Multilayer feedforward networks are universal approximators

- Computer Science, Mathematics · Neural Networks
- 1989