Corpus ID: 234343146

The Modern Mathematics of Deep Learning

@article{Berner2021TheMM,
  title={The Modern Mathematics of Deep Learning},
  author={Julius Berner and Philipp Grohs and Gitta Kutyniok and Philipp Christian Petersen},
  journal={ArXiv},
  year={2021},
  volume={abs/2105.04026}
}
We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding… 

Neurashed: A Phenomenological Model for Imitating Deep Learning Training

TLDR
It is argued that a future deep learning theory should inherit three characteristics: a hierarchically structured network architecture, parameters iteratively optimized using stochastic gradient-based methods, and information from the data that evolves compressively.
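As a hedged illustration of the first two of those ingredients only, a hierarchically structured network and stochastic gradient-based optimization (not the Neurashed model itself, and without modeling the compressive evolution of information), the NumPy sketch below trains a two-layer network with mini-batch SGD; the toy data and all hyperparameters are assumptions made for the example.

import numpy as np

rng = np.random.default_rng(0)

# Hierarchical architecture: input -> hidden (ReLU) -> output.
W1 = rng.normal(0, 0.1, (16, 2)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.1, (1, 16)); b2 = np.zeros(1)

# Toy regression data, assumed for the example.
X = rng.normal(size=(256, 2))
y = np.sin(X[:, :1]) + 0.1 * rng.normal(size=(256, 1))

lr, batch = 0.05, 32
for step in range(500):
    idx = rng.choice(len(X), batch, replace=False)   # stochastic mini-batch
    xb, yb = X[idx], y[idx]
    h = np.maximum(xb @ W1.T + b1, 0)                # hidden layer
    pred = h @ W2.T + b2                             # output layer
    err = pred - yb                                  # gradient of the squared loss w.r.t. pred (up to a factor 2)
    # Backpropagate through the two layers and take one SGD step.
    gW2 = err.T @ h / batch; gb2 = err.mean(0)
    dh = (err @ W2) * (h > 0)
    gW1 = dh.T @ xb / batch; gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1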

On generalization bounds for deep networks based on loss surface implicit regularization

TLDR
It is argued that, under reasonable assumptions, the local geometry of the energy landscape around local minima forces SGD to stay close to a low-dimensional subspace, that this induces another form of implicit regularization, and that it results in tighter bounds on the generalization error for deep neural networks.

How to Tell Deep Neural Networks What We Know

TLDR
This paper examines the inclusion of domain knowledge by means of changes to the input, the loss function, and the architecture of deep networks.
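As a hedged sketch of the loss-function route only (the paper also covers changes to the input and the architecture), the example below adds a hypothetical domain-knowledge penalty, a monotonicity constraint on one input feature, to a standard regression loss; the penalty, its weight, and the toy data are assumptions, not the paper's formulation.

import torch

def knowledge_penalty(model, x):
    # Assumed domain knowledge: the output should be non-decreasing in feature 0.
    x = x.clone().requires_grad_(True)
    grad = torch.autograd.grad(model(x).sum(), x, create_graph=True)[0]
    return torch.relu(-grad[:, 0]).mean()    # penalize negative partial derivatives

model = torch.nn.Sequential(torch.nn.Linear(3, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(128, 3), torch.randn(128, 1)      # toy data, assumed

for _ in range(200):
    opt.zero_grad()
    data_loss = torch.nn.functional.mse_loss(model(x), y)
    loss = data_loss + 0.1 * knowledge_penalty(model, x)   # data term plus knowledge term
    loss.backward()
    opt.step()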

Optimal learning of high-dimensional classification problems using deep neural networks

TLDR
For the class of locally Barron-regular decision boundaries, it is found that the optimal estimation rates are essentially independent of the underlying dimension and can be realized by empirical risk minimization methods over a suitable class of deep neural networks.
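For reference, and with notation assumed here rather than taken from the paper, empirical risk minimization over a class $\mathcal{F}$ of deep neural networks selects

$$\hat{f}_n \in \operatorname*{arg\,min}_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i), y_i\big),$$

where $(x_i, y_i)$ are the training samples and $\ell$ is a classification loss; the cited result concerns the estimation rates achievable by such a minimizer when the decision boundary is locally Barron-regular.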

Component Transfer Learning for Deep RL Based on Abstract Representations

TLDR
This work investigates a specific transfer learning approach for deep reinforcement learning in the context where the internal dynamics between two tasks are the same but the visual representations differ, and finds that the transfer performance is heavily reliant on the base model.
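A minimal sketch of the component-transfer idea under assumptions of my own (PyTorch-style modules with hypothetical shapes, not the paper's setup): the internal dynamics model learned on the source task is frozen, and only a new visual encoder is trained so that the target task's observations map into the same abstract representation.

import torch
import torch.nn as nn

# Hypothetical components; the source-task encoder is discarded and replaced.
dynamics = nn.Sequential(nn.Linear(32 + 4, 32))                    # transferred internal dynamics, kept fixed
encoder_tgt = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 32))  # new encoder for the target visuals

for p in dynamics.parameters():
    p.requires_grad = False
opt = torch.optim.Adam(encoder_tgt.parameters(), lr=1e-3)

obs = torch.randn(16, 64, 64)        # toy target-task transitions, assumed
act = torch.randn(16, 4)
next_obs = torch.randn(16, 64, 64)

for _ in range(100):
    opt.zero_grad()
    z, z_next = encoder_tgt(obs), encoder_tgt(next_obs)
    pred_next = dynamics(torch.cat([z, act], dim=1))
    loss = torch.nn.functional.mse_loss(pred_next, z_next)   # fit the encoder to the frozen dynamics
    loss.backward()
    opt.step()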

Learning Operators with Mesh-Informed Neural Networks

TLDR
This work introduces Mesh-Informed Neural Networks (MINNs), a class of architectures specifically tailored to handle mesh-based functional data, and thus of particular interest for reduced order modeling of parametrized Partial Differential Equations (PDEs).

Deep neural networks can stably solve high-dimensional, noisy, non-linear inverse problems

We study the problem of reconstructing solutions of inverse problems when only noisy measurements are available. We assume that the problem can be modeled with an infinite-dimensional forward operator

On the Omnipresence of Spurious Local Minima in Certain Neural Network Training Problems

TLDR
It is shown that the loss landscapes of training problems for deep artificial neural networks with a one-dimensional real output, whose activation functions contain an affine segment and whose hidden layers have width at least two, possess a continuum of spurious local minima for all target functions that are not affine.

Rosenblatt's first theorem and frugality of deep learning

TLDR
This note demonstrates Rosenblatt’s theorem at work, shows how an elementary perceptron can solve a version of the travel maze problem, analyses the complexity of that solution, and also constructs a deep network algorithm for the same problem.

Random feature neural networks learn Black-Scholes type PDEs without curse of dimensionality

TLDR
This article investigates the use of random feature neural networks for learning Kolmogorov partial (integro-)differential equations associated to Black-Scholes and more general exponential Lévy models, and derives bounds for the prediction error of random neural networks for learning sufficiently non-degenerate Black-Scholes type models.
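As a hedged illustration of the random-feature mechanism only (a toy one-dimensional regression, not the paper's Black-Scholes setting): the hidden-layer weights are sampled once and never trained, and only the linear readout is fit by ridge-regularized least squares; every constant below is an assumption made for the example.

import numpy as np

rng = np.random.default_rng(1)

# Toy 1-d target standing in for a PDE solution, assumed for the example.
x_train = rng.uniform(-1, 1, size=(200, 1))
y_train = np.exp(-x_train ** 2).ravel()

# Random ReLU feature map: weights and biases are sampled once and frozen.
n_features = 500
W = rng.normal(size=(1, n_features))
b = rng.uniform(0, 1, size=n_features)

def features(x):
    return np.maximum(x @ W + b, 0)

# Only the linear readout is learned, via ridge-regularized least squares.
Phi = features(x_train)
lam = 1e-6
coef = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_features), Phi.T @ y_train)

x_test = np.linspace(-1, 1, 5).reshape(-1, 1)
print(features(x_test) @ coef)       # predictions of the random feature network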