Corpus ID: 234343146

The Modern Mathematics of Deep Learning

Julius Berner, Philipp Grohs, Gitta Kutyniok, Philipp Christian Petersen
We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding… 
Neurashed: A Phenomenological Model for Imitating Deep Learning Training
It is argued that a future deep learning theory should inherit three characteristics: a hierarchically structured network architecture, parameters iteratively optimized using stochastic gradient-based methods, and information from the data that evolves compressively.
How to Tell Deep Neural Networks What We Know
This paper examines the inclusion of domain-knowledge by means of changes to: the input, the loss-function, and the architecture of deep networks.
Optimal learning of high-dimensional classification problems using deep neural networks
For the class of locally Barron-regular decision boundaries, it is found that the optimal estimation rates are essentially independent of the underlying dimension and can be realized by empirical risk minimization methods over a suitable class of deep neural networks.
Component Transfer Learning for Deep RL Based on Abstract Representations
This work investigates a specific transfer learning approach for deep reinforcement learning in the context where the internal dynamics between two tasks are the same but the visual representations differ, and finds that the transfer performance is heavily reliant on the base model.
Learning Operators with Mesh-Informed Neural Networks
This work introduces Mesh-Informed Neural Networks (MINNs), a class of architectures specifically tailored to handle mesh based functional data, and thus of particular interest for reduced order modeling of parametrized Partial Differential Equations (PDEs).
Deep neural networks can stably solve high-dimensional, noisy, non-linear inverse problems
We study the problem of reconstructing solutions of inverse problems when only noisy measurements are available. We assume that the problem can be modeled with an infinite-dimensional forward operator
On the Omnipresence of Spurious Local Minima in Certain Neural Network Training Problems
It is shown that the loss landscape of training problems for deep artificial neural networks with a one-dimensional real output, whose activation functions contain an affine segment and whose hidden layers have width at least two, possesses a continuum of spurious local minima for all target functions that are not affine.
Random feature neural networks learn Black-Scholes type PDEs without curse of dimensionality
This article investigates the use of random feature neural networks for learning Kolmogorov partial (integro-)differential equations associated with Black-Scholes and more general exponential Lévy models and derives bounds for the prediction error of random neural networks for learning sufficiently non-degenerate Black-Scholes type models.
A review of some techniques for inclusion of domain-knowledge into deep neural networks
A survey of ways in which existing scientific knowledge is included when constructing models with neural networks by means of changes to: the input, the loss-function, and the architecture of deep networks.
Training Fully Connected Neural Networks is ∃R-Complete
The algorithmic problem of finding the optimal weights and biases for a two-layer fully connected neural network to fit a given set of data points is considered, and it is shown that even very simple networks are difficult to train.