# The Modern Mathematics of Deep Learning

@article{Berner2021TheMM, title={The Modern Mathematics of Deep Learning}, author={Julius Berner and Philipp Grohs and Gitta Kutyniok and Philipp Christian Petersen}, journal={ArXiv}, year={2021}, volume={abs/2105.04026} }

We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding…

## 30 Citations

Neurashed: A Phenomenological Model for Imitating Deep Learning Training

- Computer Science
- ArXiv
- 2021

It is argued that a future deep learning theory should inherit three characteristics: a hierarchically structured network architecture, parameters iteratively optimized using stochastic gradient-based methods, and information from the data that evolves compressively.

How to Tell Deep Neural Networks What We Know

- Computer Science
- ArXiv
- 2021

This paper examines the inclusion of domain-knowledge by means of changes to: the input, the loss-function, and the architecture of deep networks.

Optimal learning of high-dimensional classification problems using deep neural networks

- Computer Science, Mathematics
- ArXiv
- 2021

For the class of locally Barron-regular decision boundaries, it is found that the optimal estimation rates are essentially independent of the underlying dimension and can be realized by empirical risk minimization methods over a suitable class of deep neural networks.

Component Transfer Learning for Deep RL Based on Abstract Representations

- Computer Science
- ArXiv
- 2021

This work investigates a specific transfer learning approach for deep reinforcement learning in the context where the internal dynamics between two tasks are the same but the visual representations differ, and finds that the transfer performance is heavily reliant on the base model.

Learning Operators with Mesh-Informed Neural Networks

- Computer Science
- ArXiv
- 2022

This work introduces Mesh-Informed Neural Networks (MINNs), a class of architectures specifically tailored to handle mesh-based functional data, and thus of particular interest for reduced order modeling of parametrized Partial Differential Equations (PDEs).

Deep neural networks can stably solve high-dimensional, noisy, non-linear inverse problems

- Mathematics
- ArXiv
- 2022

We study the problem of reconstructing solutions of inverse problems when only noisy measurements are available. We assume that the problem can be modeled with an infinite-dimensional forward operator…

On the Omnipresence of Spurious Local Minima in Certain Neural Network Training Problems

- Computer Science
- ArXiv
- 2022

It is shown that the loss landscape of training problems for deep artificial neural networks with a one-dimensional real output, whose activation functions contain an affine segment and whose hidden layers have width at least two, possesses a continuum of spurious local minima for all target functions that are not affine.

Random feature neural networks learn Black-Scholes type PDEs without curse of dimensionality

- Computer Science
- ArXiv
- 2021

This article investigates the use of random feature neural networks for learning Kolmogorov partial (integro-)differential equations associated with Black-Scholes and more general exponential Lévy models, and derives bounds for the prediction error of random neural networks for learning sufficiently non-degenerate Black-Scholes type models.

A review of some techniques for inclusion of domain-knowledge into deep neural networks

- Computer Science
- Scientific Reports
- 2022

A survey of ways in which existing scientific knowledge is included when constructing models with neural networks by means of changes to: the input, the loss-function, and the architecture of deep networks.

Training Fully Connected Neural Networks is ∃R-Complete

- Computer Science
- ArXiv
- 2022

The algorithmic problem of finding the optimal weights and biases for a two-layer fully connected neural network to fit a given set of data points is considered, and it is shown that even very simple networks are difficult to train.