# Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

@article{Jing2017TunableEU, title={Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs}, author={Li Jing and Yichen Shen and T. Dub{\v{c}}ek and J. Peurifoy and S. Skirlo and Y. LeCun and Max Tegmark and M. Solja{\v{c}}i{\'c}}, journal={ArXiv}, year={2017}, volume={abs/1612.05231} }

Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. [...] Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely O(1) per parameter. Finally, we test the performance of…
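The tunable construction the abstract describes can be illustrated with a minimal NumPy sketch (this is an illustration of the idea, not the paper's implementation): a unitary matrix is built as a product of layers of non-overlapping 2×2 rotation blocks, each block holding two learnable parameters, so updating one parameter touches O(1) entries. The function names `givens_layer` and `eunn_matrix` are my own.

```python
import numpy as np

def givens_layer(n, thetas, phis, offset):
    """One rotation layer: 2x2 unitary blocks on index pairs
    (offset, offset+1), (offset+2, offset+3), ...; each block is
    parametrized by an angle theta and a phase phi."""
    U = np.eye(n, dtype=complex)
    i, k = offset, 0
    while i + 1 < n:
        c, s = np.cos(thetas[k]), np.sin(thetas[k])
        p = np.exp(1j * phis[k])
        U[i, i], U[i, i + 1] = p * c, -p * s
        U[i + 1, i], U[i + 1, i + 1] = s, c
        i += 2
        k += 1
    return U

def eunn_matrix(n, L, rng):
    """Product of L rotation layers with alternating offsets.
    Fewer layers span only a subspace of the unitary group;
    more layers increase the representable set -- the 'tunable
    capacity' idea from the abstract."""
    W = np.eye(n, dtype=complex)
    for layer in range(L):
        npairs = n // 2
        thetas = rng.uniform(0, 2 * np.pi, npairs)
        phis = rng.uniform(0, 2 * np.pi, npairs)
        W = givens_layer(n, thetas, phis, layer % 2) @ W
    return W
```

Since every 2×2 block is unitary, each layer (and hence the full product) is exactly unitary by construction, which is what prevents gradient norms from exploding or vanishing through the recurrence.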

#### 125 Citations

Complex Evolution Recurrent Neural Networks (ceRNNs)

- Computer Science, Engineering
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018

In large-scale real-world speech recognition, it is found that pre-pending a uRNN degrades the performance of the baseline LSTM acoustic models, while pre-pending a ceRNN improves the performance over the baseline by 0.8% absolute WER.

Improving the Performance of Unitary Recurrent Neural Networks and Their Application in Real-life Tasks

- Computer Science
- CompSysTech
- 2018

This work uses a recent approach for a recurrent neural network model implementing a unitary matrix in its recurrent connection to deal with long-term dependencies, without affecting its memory abilities, and achieves time performance up to 5 times better than the original implementation.

Orthogonal Recurrent Neural Networks with Scaled Cayley Transform

- Mathematics, Computer Science
- ICML
- 2018

This work proposes a simple, novel update scheme that maintains orthogonal recurrent weight matrices without using complex-valued matrices, by parametrizing with a skew-symmetric matrix via the Cayley transform.
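The Cayley transform mentioned in this snippet maps any real skew-symmetric matrix to an orthogonal one, which is what lets the update scheme stay in the orthogonal group. A minimal NumPy sketch (the scaling matrix that gives the "scaled" variant its name is omitted here):

```python
import numpy as np

def cayley_orthogonal(A):
    """Cayley transform W = (I + A)^{-1} (I - A).
    For real skew-symmetric A (A.T == -A), every eigenvalue of A is
    purely imaginary, so I + A is always invertible and W is orthogonal."""
    n = A.shape[0]
    I = np.eye(n)
    return np.linalg.solve(I + A, I - A)

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = M - M.T                      # skew-symmetric parametrization
W = cayley_orthogonal(A)         # exactly orthogonal by construction
```

Note that the plain Cayley transform cannot produce orthogonal matrices with a −1 eigenvalue; the scaled variants (scoRNN/scuRNN) multiply by a diagonal sign matrix to cover the rest of the group.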

Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections

- Computer Science, Mathematics
- ICML
- 2017

A new parametrisation of the transition matrix is presented which allows efficient training of an RNN while ensuring that the matrix is always orthogonal, and gives similar benefits to the unitary constraint, without the time complexity limitations.
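The Householder parametrisation referenced here exploits the fact that a product of reflections is always orthogonal, so the transition matrix stays on the manifold no matter how the reflection vectors are updated. A hedged NumPy sketch (function names are mine, and the number of reflections `k` plays the capacity-tuning role):

```python
import numpy as np

def householder(v):
    """Householder reflection H = I - 2 v v^T / ||v||^2.
    H is orthogonal and symmetric for any nonzero v."""
    v = v / np.linalg.norm(v)
    return np.eye(len(v)) - 2.0 * np.outer(v, v)

def householder_product(vectors):
    """Product of k reflections in R^n. Small k gives a restricted
    orthogonal family; k = n reaches (essentially) the whole group."""
    n = len(vectors[0])
    W = np.eye(n)
    for v in vectors:
        W = householder(v) @ W
    return W
```

Each reflection costs O(n) to apply to a vector, which is the source of the efficiency claim: the full matrix never has to be materialized during training.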

Fast orthogonal recurrent neural networks employing a novel parametrisation for orthogonal matrices

- Computer Science
- Signal Process.
- 2019

The algorithms of this paper are fast, tunable, and full-capacity; the target variable is updated by optimizing a matrix multiplier instead of using explicit gradient descent.

Input-Output Equivalence of Unitary and Contractive RNNs

- Computer Science, Mathematics
- NeurIPS
- 2019

It is shown that for any contractive RNN with ReLU activations, there is a URNN with at most twice the number of hidden states and the identical input-output mapping, and URNNs are as expressive as general RNNs.

Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform

- Computer Science, Mathematics
- AAAI
- 2019

In the experiments conducted, the scaled Cayley unitary recurrent neural network (scuRNN) achieves comparable or better results than scoRNN and other unitary RNNs without fixing the scaling matrix.

Acceleration Method for Learning Fine-Layered Optical Neural Networks

- Computer Science, Physics
- 2021

An optical neural network (ONN) is a promising system due to its high-speed and low-power operation. Its linear unit performs a multiplication of an input vector and a weight matrix in optical analog…

Rotational Unit of Memory

- Computer Science, Mathematics
- ICLR
- 2018

A novel RNN model that unifies the state-of-the-art approaches: Rotational Unit of Memory (RUM), which is, naturally, a unitary matrix providing architectures with the power to learn long-term dependencies by overcoming the vanishing and exploding gradients problem.

Gated Orthogonal Recurrent Units: On Learning to Forget

- Computer Science, Mathematics
- Neural Computation
- 2019

We present a novel recurrent neural network (RNN)–based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or…

#### References

Showing 1–10 of 32 references

Unitary Evolution Recurrent Neural Networks

- Computer Science, Mathematics
- ICML
- 2016

This work constructs an expressive unitary weight matrix by composing several structured matrices that act as building blocks with parameters to be learned, and demonstrates the potential of this architecture by achieving state of the art results in several hard tasks involving very long-term dependencies.
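The "composing structured matrices" idea can be made concrete with a small NumPy sketch: diagonal phase matrices, the unitary DFT, a permutation, and a complex Householder reflection are each individually unitary, so their product is too, while each factor applies in O(n) or O(n log n) time. This is a sketch in the spirit of that construction, not the paper's exact factor ordering.

```python
import numpy as np

def structured_unitary(n, rng):
    """Product of cheap structured unitary factors: diagonal phases,
    a unitary DFT (and its inverse), a permutation, and a reflection."""
    def diag_phase():
        return np.diag(np.exp(1j * rng.uniform(0, 2 * np.pi, n)))

    def reflection():
        v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
        v = v / np.linalg.norm(v)
        return np.eye(n) - 2.0 * np.outer(v, v.conj())

    F = np.fft.fft(np.eye(n)) / np.sqrt(n)   # unitary DFT matrix
    Finv = F.conj().T
    P = np.eye(n)[rng.permutation(n)]        # permutation matrix
    return diag_phase() @ Finv @ diag_phase() @ P @ reflection() @ F @ diag_phase()
```

Only the diagonal phases and the reflection vector carry learnable parameters, which is why this family spans a subspace of the unitary group rather than all of it.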

Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections

- Computer Science, Mathematics
- ICML
- 2017

A new parametrisation of the transition matrix is presented which allows efficient training of an RNN while ensuring that the matrix is always orthogonal, and gives similar benefits to the unitary constraint, without the time complexity limitations.

Full-Capacity Unitary Recurrent Neural Networks

- Computer Science, Mathematics
- NIPS
- 2016

This work provides a theoretical argument to determine if a unitary parameterization has restricted capacity, and shows how a complete, full-capacity unitary recurrence matrix can be optimized over the differentiable manifold of unitary matrices.
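"Optimizing over the manifold of unitary matrices" amounts to a multiplicative update that keeps the weight exactly unitary at every step. One common form (a sketch in the style of this line of work, not necessarily the paper's exact update) builds a skew-Hermitian matrix from the Euclidean gradient and applies a Cayley retraction:

```python
import numpy as np

def manifold_update(W, G, lr=0.01):
    """One Riemannian-style step on the unitary manifold.
    G is the Euclidean gradient of the loss w.r.t. W. The matrix
    A = G W^H - W G^H is skew-Hermitian, so the Cayley factor
    (I + (lr/2) A)^{-1} (I - (lr/2) A) is unitary, and the updated
    W stays exactly unitary."""
    n = W.shape[0]
    A = G @ W.conj().T - W @ G.conj().T     # skew-Hermitian: A^H = -A
    I = np.eye(n, dtype=complex)
    return np.linalg.solve(I + (lr / 2) * A, I - (lr / 2) * A) @ W
```

Because the update multiplies by a unitary factor rather than adding a raw gradient step, no re-projection onto the manifold is ever needed.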

A Simple Way to Initialize Recurrent Networks of Rectified Linear Units

- Computer Science
- ArXiv
- 2015

This paper proposes a simpler solution that uses recurrent neural networks composed of rectified linear units, and is comparable to LSTM on four benchmarks: two toy problems involving long-range temporal structures, a large language modeling problem, and a benchmark speech recognition problem.

Recurrent Orthogonal Networks and Long-Memory Tasks

- Computer Science, Mathematics
- ICML
- 2016

This work carefully analyzes two synthetic datasets originally outlined in (Hochreiter and Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps and explicitly construct RNN solutions to these problems.

Bidirectional Recurrent Neural Networks as Generative Models

- Computer Science
- NIPS
- 2015

This work proposes two probabilistic interpretations of bidirectional RNNs that can be used to reconstruct missing gaps efficiently and provides results on music data for which the Bayesian inference is computationally infeasible, demonstrating the scalability of the proposed methods.

Sequence to Sequence Learning with Neural Networks

- Computer Science
- NIPS
- 2014

This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.

ImageNet classification with deep convolutional neural networks

- Computer Science
- Commun. ACM
- 2012

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

Long-term recurrent convolutional networks for visual recognition and description

- Computer Science, Medicine
- 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015

A novel recurrent convolutional architecture suitable for large-scale visual learning which is end-to-end trainable, and shows such models have distinct advantages over state-of-the-art models for recognition or generation which are separately defined and/or optimized.