• Corpus ID: 10136783

Learning Real and Boolean Functions: When Is Deep Better Than Shallow

Hrushikesh Narhar Mhaskar, Qianli Liao, Tomaso A. Poggio
This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. HNM was supported in part by ARO Grant W911NF-15-1-0385.


Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning
Theory I: Why and When Can Deep Networks Avoid the Curse of Dimensionality?
The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning; deep convolutional networks are a special case of these conditions.
Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review
An emerging body of theoretical results on deep learning including the conditions under which it can be exponentially better than shallow learning are reviewed, together with new results, open problems and conjectures.
Learning Functions: When Is Deep Better Than Shallow
It is proved that deep (hierarchical) networks can approximate the class of compositional functions with the same accuracy as shallow networks but with an exponentially smaller number of training parameters and a lower VC-dimension.
Classifying Three-input Boolean Functions by Neural Networks
  • Naoshi Sakamoto
  • Computer Science
    2019 20th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)
  • 2019
By focusing on the structure of Boolean functions, this study finds that the minimum number of required epochs can be estimated from the disjunctive normal form of the Boolean function.
When and Why Are Deep Networks Better Than Shallow Ones?
This theorem proves an old conjecture by Bengio on the role of depth in networks, characterizes precisely the conditions under which it holds, and suggests possible answers to the puzzle of why high-dimensional deep networks trained on large training sets often do not seem to overfit.
Limitations of shallow nets approximation
Error bounds for approximations with deep ReLU networks
Deep vs. shallow networks: An approximation theory perspective
A new definition of relative dimension is proposed to encapsulate different notions of sparsity of a function class that can possibly be exploited by deep networks but not by shallow ones to drastically reduce the complexity required for approximation and learning.
Limitations of shallow networks representing finite mappings
  • V. Kůrková
  • Computer Science
    Neural Computing and Applications
  • 2018
The limitations of shallow networks in efficiently computing real-valued functions on finite domains are investigated, and connections to the No Free Lunch Theorem and the central paradox of coding theory are discussed.


A Provably Efficient Algorithm for Training Deep Networks
The main goal of this paper is the derivation of a provably efficient, layer-by-layer algorithm for training deep neural networks, denoted the Basis Learner, which comes with formal polynomial-time convergence guarantees.
Representation Benefits of Deep Feedforward Networks
This note provides a family of classification problems, indexed by a positive integer $k$, where all shallow networks with fewer than exponentially (in $k$) many nodes exhibit error at least $1/6$, whereas a sufficiently deep network of constant width achieves zero error.
Scaling learning algorithms towards AI
It is argued that deep architectures have the potential to generalize in non-local ways, i.e., beyond immediate neighbors, and that this is crucial in order to make progress on the kind of complex tasks required for artificial intelligence.
I-theory on depth vs width: hierarchical function composition
It is proved that recognition invariant to translation cannot be computed by shallow networks in the presence of clutter, and a general framework that includes the compositional case is sketched.
Shallow vs. Deep Sum-Product Networks
It is proved there exist families of functions that can be represented much more efficiently with a deep network than with a shallow one, i.e. with substantially fewer hidden units.
The Mathematics of Learning: Dealing with Data
  • T. Poggio, S. Smale
  • Computer Science
    2005 International Conference on Neural Networks and Brain
  • 2005
The mathematical foundations of learning theory are outlined and a key algorithm is described, which is central to developing systems tailored to a broad range of data analysis and information extraction tasks.
Learning Boolean Functions via the Fourier Transform
The Fourier transform representation of functions with Boolean inputs has been far less studied, but it appears that it can be used to learn many classes of Boolean functions.
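The Fourier-analytic view this entry alludes to can be illustrated with a minimal sketch (hypothetical illustration, not code from the cited paper): every Boolean function $f:\{0,1\}^n \to \{-1,+1\}$ expands uniquely over the parity characters $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$, with coefficients $\hat{f}(S) = \mathbb{E}_x[f(x)\chi_S(x)]$.

```python
from itertools import product

def fourier_coefficients(f, n):
    """Walsh-Fourier coefficients of f: {0,1}^n -> {-1,+1}.

    f_hat(S) = E_x[ f(x) * chi_S(x) ], where chi_S(x) = (-1)^(sum of x_i, i in S).
    """
    coeffs = {}
    inputs = list(product([0, 1], repeat=n))
    for indicator in product([0, 1], repeat=n):     # indicator vector of subset S
        S = tuple(i for i in range(n) if indicator[i])
        total = 0.0
        for x in inputs:
            chi = (-1) ** sum(x[i] for i in S)
            total += f(x) * chi
        coeffs[S] = total / len(inputs)
    return coeffs

# Example: XOR of two bits in the +/-1 convention is exactly the parity
# character chi_{0,1}, so its only nonzero coefficient is 1 on S = {0, 1}.
xor = lambda x: -1 if x[0] ^ x[1] else 1
c = fourier_coefficients(xor, 2)
```

Learning algorithms in this line of work estimate the large coefficients from samples and discard the rest, which works whenever the target's Fourier mass is concentrated on few subsets.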
Neural Network Learning - Theoretical Foundations
The authors explain the role of scale-sensitive versions of the Vapnik Chervonenkis dimension in large margin classification, and in real prediction, and discuss the computational complexity of neural network learning.
On the Number of Linear Regions of Deep Neural Networks
We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations, in terms of their symmetries and the number of linear regions that they have.
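The linear-region count that this entry studies can be probed empirically (a rough sketch with arbitrary random weights and widths, not the cited paper's construction): a ReLU network is piecewise linear, and each distinct pattern of active/inactive units along a 1-D input slice marks a distinct linear piece.

```python
import numpy as np

# Hypothetical small two-hidden-layer ReLU net on a 1-D input.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 1)), rng.normal(size=8)
W2, b2 = rng.normal(size=(4, 8)), rng.normal(size=4)

def activation_pattern(x):
    """Which ReLU units are active at input x; each distinct pattern
    corresponds to one linear region of the computed function."""
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0.0) + b2
    return tuple((h1 > 0).astype(int)) + tuple((h2 > 0).astype(int))

# Sweep a dense grid and count the distinct patterns encountered.
xs = np.linspace(-5.0, 5.0, 20001).reshape(-1, 1)
patterns = {activation_pattern(x) for x in xs}
n_regions = len(patterns)
```

The theoretical results referenced above show that, with depth, the number of such regions can grow exponentially in the number of layers for a fixed budget of units, while shallow networks are limited to polynomial growth.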
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
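The core operation of batch normalization is simple to state (a minimal NumPy sketch of the training-time forward pass only; the cited paper additionally covers learned statistics for inference and gradient computation):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    x: array of shape (batch, features); gamma and beta are the
    learned per-feature scale and shift parameters.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta

# A batch with shifted, scaled features comes out approximately standardized.
x = np.random.default_rng(1).normal(loc=3.0, scale=2.0, size=(64, 5))
y = batch_norm(x, gamma=np.ones(5), beta=np.zeros(5))
```

Keeping each layer's input distribution stable in this way is what lets the cited work use much larger learning rates, which is where the reported reduction in training steps comes from.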