Deep Unitary Convolutional Neural Networks

  • Hao-Yuan Chang, Kang L. Wang
  • Published in ICANN, 23 February 2021
Deep neural networks can suffer from the exploding and vanishing activation problem, in which the networks fail to train properly because the neural signals either amplify or attenuate across the layers and become saturated. While other normalization methods aim to fix this problem, most of them incur inference speed penalties in applications that require running averages of the neural activations. Here we extend the unitary framework based on Lie algebra to neural networks of any…
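The Lie-algebra construction the abstract alludes to can be sketched in a few lines of NumPy/SciPy: exponentiating a skew-Hermitian matrix yields a unitary map, and unitary maps preserve activation norms across arbitrarily many layers. This is an illustrative sketch of the general idea, not the paper's exact parameterization.

```python
import numpy as np
from scipy.linalg import expm

# A skew-Hermitian matrix A (A^H = -A) is an element of the Lie algebra
# u(n); its matrix exponential U = exp(A) lies in the unitary group U(n).
rng = np.random.default_rng(0)
n = 8
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = B - B.conj().T                      # skew-Hermitian by construction
U = expm(A)                             # unitary: U^H U = I
assert np.allclose(U.conj().T @ U, np.eye(n))

# Unitary layers preserve the 2-norm of the signal, so repeated
# application neither amplifies nor attenuates the activations.
x0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x = x0
for _ in range(50):                     # 50 "layers" of the same map
    x = U @ x
assert np.isclose(np.linalg.norm(x), np.linalg.norm(x0))
```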

Orthogonal Convolutional Neural Networks
The proposed orthogonal convolution requires no additional parameters, incurs little computational overhead, and consistently outperforms the kernel-orthogonality alternative on a wide range of tasks such as image classification and inpainting under supervised, semi-supervised, and unsupervised settings.
Unitary Evolution Recurrent Neural Networks
This work constructs an expressive unitary weight matrix by composing several structured matrices that act as building blocks with parameters to be learned, and demonstrates the potential of this architecture by achieving state of the art results in several hard tasks involving very long-term dependencies.
Stable Architectures for Deep Neural Networks
New forward propagation techniques inspired by systems of Ordinary Differential Equations (ODE) are proposed that overcome this challenge and lead to well-posed learning problems for arbitrarily deep networks.
On orthogonality and learning recurrent networks with long term dependencies
This paper proposes a weight matrix factorization and parameterization strategy through which the degree of expansivity induced during backpropagation can be controlled and finds that hard constraints on orthogonality can negatively affect the speed of convergence and model performance.
On the Number of Linear Regions of Deep Neural Networks
We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have.
On the difficulty of training recurrent neural networks
This paper proposes a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing-gradients problem, and empirically validates the hypothesis and the proposed solutions.
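The clipping strategy this summary describes is simple to sketch: rescale all gradients whenever their global L2 norm exceeds a threshold. A minimal NumPy version (the function name and the threshold value are illustrative):

```python
import numpy as np

def clip_grad_norm(grads, max_norm):
    """Rescale a list of gradient arrays so that their global L2 norm
    does not exceed max_norm; leave them untouched otherwise."""
    total = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total > max_norm:
        scale = max_norm / total
        grads = [g * scale for g in grads]
    return grads, total

# Two parameter groups with global norm sqrt(9 + 16 + 144) = 13.
grads = [np.array([3.0, 4.0]), np.array([12.0])]
clipped, norm_before = clip_grad_norm(grads, max_norm=5.0)
norm_after = np.sqrt(sum(np.sum(g ** 2) for g in clipped))
assert np.isclose(norm_before, 13.0)
assert np.isclose(norm_after, 5.0)
```

Because only the norm is rescaled, the gradient's direction is preserved; the update step is merely shortened when it would otherwise explode.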
Orthogonal Recurrent Neural Networks with Scaled Cayley Transform
This work proposes a simpler, novel update scheme that maintains orthogonal recurrent weight matrices without using complex-valued matrices, by parametrizing with a skew-symmetric matrix via the Cayley transform.
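The Cayley transform maps any skew-symmetric matrix A to an orthogonal matrix W = (I + A)^(-1)(I - A). A minimal NumPy sketch of this parameterization (the paper's scaled variant additionally multiplies by a diagonal scaling matrix, which is omitted here):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Any matrix M yields a skew-symmetric A = M - M^T (so A^T = -A).
M = rng.standard_normal((n, n))
A = M - M.T

# Cayley transform: W = (I + A)^{-1} (I - A). I + A is always
# invertible because the eigenvalues of A are purely imaginary.
I = np.eye(n)
W = np.linalg.solve(I + A, I - A)

# W is exactly orthogonal (up to floating-point error): W^T W = I.
assert np.allclose(W.T @ W, I)
```

Training then updates the unconstrained entries of A; orthogonality of W holds by construction at every step, with no projection or re-orthogonalization needed.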
Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs
This work presents a new architecture for implementing Efficient Unitary Neural Networks (EUNNs) and finds that it significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture in final performance and/or wall-clock training speed.
Full-Capacity Unitary Recurrent Neural Networks
This work provides a theoretical argument to determine if a unitary parameterization has restricted capacity, and shows how a complete, full-capacity unitary recurrence matrix can be optimized over the differentiable manifold of unitary matrices.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
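The core normalization step is easy to sketch: standardize each feature over the batch, then apply a learned scale and shift. A minimal training-mode NumPy version (the running averages used at inference, which the first abstract flags as a speed penalty, are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch to zero mean and unit
    variance, then apply a learned scale (gamma) and shift (beta)."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(2)
x = 10.0 * rng.standard_normal((64, 4)) + 3.0    # batch of 64, 4 features
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))

# Each feature is now approximately standardized over the batch.
assert np.allclose(y.mean(axis=0), 0.0, atol=1e-6)
assert np.allclose(y.std(axis=0), 1.0, atol=1e-2)
```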