Fractional deep neural network via constrained optimization

Harbir Antil, Ratna Khatri, Rainald Löhner, and Deepanshu Verma. Machine Learning: Science and Technology.
This paper introduces a novel algorithmic framework for a deep neural network (DNN) which, in a mathematically rigorous manner, incorporates history (or memory) into the network by ensuring that all layers are connected to one another. This DNN, called Fractional-DNN, can be viewed as a time discretization of a fractional-in-time nonlinear ordinary differential equation (ODE). The learning problem is then a minimization problem subject to that fractional ODE as a constraint. We emphasize…
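The memory mechanism described above can be sketched concretely. The snippet below is a minimal NumPy illustration, not the authors' implementation: it assumes a Caputo fractional ODE of the hypothetical form D^γ x(t) = tanh(W(t) x + b(t)) and discretizes it with the standard L1 scheme, whose history weights make every layer update depend on all earlier states. For γ = 1 the memory weights vanish and the scheme reduces to a plain ResNet-style forward Euler step.

```python
import numpy as np
from math import gamma as Gamma

def l1_weights(order, n):
    # L1-scheme history weights a_j = (j+1)^(1-order) - j^(1-order)
    j = np.arange(n, dtype=float)
    return (j + 1.0) ** (1.0 - order) - j ** (1.0 - order)

def fractional_forward(x0, Ws, bs, gamma_ord=0.5, tau=0.1):
    # Forward pass read as the L1 time discretization of the Caputo
    # fractional ODE  D^gamma x(t) = tanh(W(t) x + b(t)).  The memory
    # term sums over ALL past states, which is how a fractional-in-time
    # network connects every layer to every earlier one.
    c = Gamma(2.0 - gamma_ord) * tau ** gamma_ord
    xs = [np.asarray(x0, dtype=float)]
    for k, (W, b) in enumerate(zip(Ws, bs)):
        a = l1_weights(gamma_ord, k + 1)  # a_0 .. a_k
        hist = sum(a[j] * (xs[k + 1 - j] - xs[k - j]) for j in range(1, k + 1))
        x_next = xs[k] - hist + c * np.tanh(W @ xs[k] + b)
        xs.append(x_next)
    return xs
```

For 0 < γ < 1 the weights a_j decay slowly in j, so distant layers still influence the current update; this is the "history" the abstract refers to.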

An Optimal Time Variable Learning Framework for Deep Neural Networks

The novelty in this paper lies in letting the discretization parameter (time step size) vary from layer to layer; the step sizes then need to be learned within an optimization framework.
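In residual-network terms, the idea amounts to promoting the step size from a fixed constant to one trainable parameter per layer. The sketch below is an assumed minimal illustration (not the paper's actual architecture), using a tanh residual block:

```python
import numpy as np

def variable_step_forward(x, Ws, bs, hs):
    # ResNet-style forward pass in which the time step h_k is a
    # per-layer parameter (learned alongside W_k, b_k) instead of a
    # single fixed discretization constant shared by all layers.
    x = np.asarray(x, dtype=float)
    for W, b, h in zip(Ws, bs, hs):
        x = x + h * np.tanh(W @ x + b)
    return x
```

In an optimization framework the vector of step sizes hs would simply be appended to the list of trainable parameters.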

Deep neural nets with fixed bias configuration

A Moreau-Yosida regularization based algorithm is proposed to handle inequality constraints on the bias vectors in each layer of a neural network, and theoretical convergence of this algorithm is established.
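As a rough illustration of the technique named in the abstract: the Moreau-Yosida regularization of the indicator function of a box constraint is a smooth quadratic distance penalty that can be added to the training loss. The function below is a generic sketch of that penalty (the bounds and smoothing parameter are illustrative, not the paper's values):

```python
import numpy as np

def moreau_yosida_penalty(b, lower, upper, lam=1e-2):
    # Moreau-Yosida regularization of the indicator of the box
    # [lower, upper]:  (1 / (2 * lam)) * squared distance of b to the
    # box.  It is zero for feasible b and grows quadratically with the
    # constraint violation, so it can be minimized by gradient methods.
    below = np.maximum(lower - b, 0.0)
    above = np.maximum(b - upper, 0.0)
    return 0.5 / lam * (np.sum(below ** 2) + np.sum(above ** 2))
```

Smaller lam enforces the bias constraints more strictly at the cost of a stiffer optimization problem.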

On quadrature rules for solving Partial Differential Equations using Neural Networks

Optimal Control, Numerics, and Applications of Fractional PDEs

Data Assimilation with Deep Neural Nets Informed by Nudging

This work proposes a new approach to data assimilation via machine learning in which Deep Neural Networks (DNNs) are taught the nudging algorithm, and standard exponential-type approximation results are established for the Lorenz 63 model for both the continuous- and discrete-in-time models.

Artificial neural networks: a practical review of applications involving fractional calculus

In this work, a bibliographic analysis of artificial neural networks (ANNs) using fractional calculus (FC) theory is developed to summarize the main features and applications of ANNs.

NINNs: Nudging Induced Neural Networks

NINNs offer multiple advantages; for instance, they lead to higher accuracy than existing data assimilation algorithms such as nudging. Rigorous convergence analysis is established for NINNs.

Deep learning or interpolation for inverse modelling of heat and fluid flow problems?

The results indicate that interpolation algorithms outperform deep neural networks in accuracy for linear heat conduction, while the reverse is true for nonlinear heat conduction problems; for the fluid flow problems, both methods offer similar levels of accuracy.

Novel DNNs for Stiff ODEs with Applications to Chemically Reacting Flows

Experimental results show that it is helpful to account for the physical properties of species while designing DNNs, and the proposed approach to approximate stiff ODEs is shown to generalize well.

Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations

It is shown that many effective networks, such as ResNet, PolyNet, FractalNet, and RevNet, can be interpreted as different numerical discretizations of differential equations, and a connection is established between stochastic control and noise injection in the training process that helps to improve the generalization of the networks.
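The core correspondence is easy to see in code. The sketch below (an assumed illustration, not taken from the paper) shows a ResNet block as one forward Euler step of dx/dt = f(x, θ), and a midpoint (RK2) step of the same ODE, which yields a structurally different block from the same continuous model:

```python
import numpy as np

def f(x, W, b):
    # A generic layer vector field; tanh is an illustrative choice.
    return np.tanh(W @ x + b)

def resnet_step(x, W, b, h=1.0):
    # A ResNet residual block is exactly a forward Euler step of
    # dx/dt = f(x, theta):  x_{k+1} = x_k + h * f(x_k).
    return x + h * f(x, W, b)

def midpoint_step(x, W, b, h=1.0):
    # The midpoint (RK2) discretization of the SAME ODE gives a
    # different, nested block structure from one continuous model.
    return x + h * f(x + 0.5 * h * f(x, W, b), W, b)
```

Different discretizations of one ODE thus correspond to different architectures, which is the bridge the paper's title refers to.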

Deep Neural Networks Motivated by Partial Differential Equations

A new PDE interpretation of a class of deep convolutional neural networks (CNNs) that are commonly used to learn from speech, image, and video data is established, and three new ResNet architectures are derived that fall into two new classes: parabolic and hyperbolic CNNs.

fPINNs: Fractional Physics-Informed Neural Networks

This work extends PINNs to fractional PINNs (fPINNs) to solve space-time fractional advection-diffusion equations (fractional ADEs), and demonstrates their accuracy and effectiveness in solving multi-dimensional forward and inverse problems with forcing terms whose values are only known at randomly scattered spatio-temporal coordinates (black-box forcing terms).

Layer-Parallel Training of Deep Residual Neural Networks

Using numerical examples from supervised classification, it is demonstrated that the new approach achieves similar training performance to traditional methods, but enables layer-parallelism and thus provides speedup over layer-serial methods through greater concurrency.

Stable architectures for deep neural networks

This paper relates the exploding and vanishing gradient phenomenon to the stability of the discrete ODE and presents several strategies for stabilizing deep learning for very deep networks.
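One stabilizing strategy from this line of work can be sketched briefly: replace the weight matrix by its antisymmetric part minus a small shift, so that the layer Jacobian's eigenvalues have non-positive real part and the forward dynamics neither explode nor vanish too quickly. The code below is a minimal illustration in the spirit of the paper, with illustrative values for the step size and shift:

```python
import numpy as np

def antisymmetric_step(x, W, b, h=0.1, decay=0.01):
    # Use A = (W - W^T)/2 - decay * I in place of W.  The antisymmetric
    # part has purely imaginary eigenvalues; the small negative shift
    # makes the continuous dynamics mildly dissipative, which keeps the
    # forward propagation (and hence the gradients) well behaved.
    A = 0.5 * (W - W.T) - decay * np.eye(W.shape[0])
    return x + h * np.tanh(A @ x + b)
```

Iterating this step many times keeps the state bounded even for a randomly drawn W, in contrast to an unconstrained residual step, which can blow up.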

Deep learning as optimal control problems: models and numerical methods

This work considers recent work of Haber and Ruthotto (2017) and Chang et al. (2018), where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint, and compares these deep learning algorithms numerically in terms of induced flow and generalisation ability.

Bilevel optimization, deep learning and fractional Laplacian regularization with applications in tomography

The key advantage of using fractional Laplacian as a regularizer is that it leads to a linear operator, as opposed to the total variation regularization which results in a nonlinear degenerate operator.

Deep Neural Networks Learn Non-Smooth Functions Effectively

It is shown that the estimators by DNNs are almost optimal to estimate the non-smooth functions, while some of the popular models do not attain the optimal rate.

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
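The transformation itself is compact. The sketch below is a minimal NumPy version of the training-time Batch Normalization computation (it omits the running statistics used at inference time): each feature is normalized over the mini-batch, then rescaled and shifted by learned parameters.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # x: (batch, features).  Normalize each feature over the mini-batch
    # (axis 0), then apply a learned scale (gamma) and shift (beta).
    # eps guards against division by zero for near-constant features.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```

With gamma = 1 and beta = 0 the output of each feature has (approximately) zero mean and unit variance over the batch, which is the normalization that reduces internal covariate shift during training.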