Corpus ID: 5287947

Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

@article{Jing2017TunableEU,
  title={Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs},
  author={Li Jing and Yichen Shen and T. Dub{\v{c}}ek and J. Peurifoy and S. Skirlo and Y. LeCun and Max Tegmark and M. Solja{\v{c}}i{\'c}},
  journal={ArXiv},
  year={2017},
  volume={abs/1612.05231}
}
Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. [...] Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely O(1) per parameter. Finally, we test the performance of…
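The tunable capacity and O(1)-per-parameter cost come from building the unitary matrix as a product of layers of 2×2 rotations. Below is a minimal numpy sketch of that idea, assuming an alternating-pair layout; the function names, pairing scheme, and phase conventions are illustrative and may differ from the paper's exact construction.

```python
import numpy as np

def rotation_layer(n, thetas, phis, offset=0):
    """One EUNN-style layer: independent 2x2 unitary rotations on adjacent
    coordinate pairs (offset 0 or 1). Each rotation has two parameters
    (theta, phi); applying a layer to a vector takes O(n) work, i.e. O(1)
    per parameter (the dense matrix is built here only for clarity)."""
    U = np.eye(n, dtype=complex)
    for k, i in enumerate(range(offset, n - 1, 2)):
        t, p = thetas[k], phis[k]
        U[i:i+2, i:i+2] = np.array([[np.exp(1j * p) * np.cos(t), -np.sin(t)],
                                    [np.exp(1j * p) * np.sin(t),  np.cos(t)]])
    return U

def eunn_matrix(n, L, rng):
    """Compose L alternating rotation layers. Small L spans a subspace of
    the unitary group; increasing L toward ~n approaches the full space."""
    W = np.eye(n, dtype=complex)
    for l in range(L):
        m = (n - (l % 2)) // 2
        thetas = rng.uniform(0, 2 * np.pi, m)
        phis = rng.uniform(0, 2 * np.pi, m)
        W = rotation_layer(n, thetas, phis, offset=l % 2) @ W
    return W

rng = np.random.default_rng(0)
W = eunn_matrix(8, L=4, rng=rng)
print(np.allclose(W.conj().T @ W, np.eye(8)))  # True: W is unitary
```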
Complex Evolution Recurrent Neural Networks (ceRNNs)
In large-scale real-world speech recognition, it is found that prepending a uRNN degrades the performance of the baseline LSTM acoustic models, while prepending a ceRNN improves performance over the baseline by 0.8% absolute WER.
Improving the Performance of Unitary Recurrent Neural Networks and Their Application in Real-life Tasks
This work builds on a recent recurrent neural network model that implements a unitary matrix in its recurrent connection to deal with long-term dependencies without affecting memory abilities, and achieves runtime up to 5 times better than the original implementation.
Orthogonal Recurrent Neural Networks with Scaled Cayley Transform
This work proposes a simpler and novel update scheme to maintain orthogonal recurrent weight matrices without using complex-valued matrices, by parametrizing with a skew-symmetric matrix via the Cayley transform.
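For concreteness, here is a minimal numpy sketch of the Cayley-transform parametrization this entry refers to; the diagonal scaling matrix used in the scaled (scoRNN-style) variant is only noted in a comment, and the helper names are illustrative.

```python
import numpy as np

def cayley_orthogonal(A):
    """Map a real skew-symmetric matrix A (A.T == -A) to an orthogonal
    matrix via the Cayley transform W = (I + A)^{-1} (I - A).
    The scaled variant additionally multiplies by a fixed diagonal matrix
    of +/-1 entries to reach orthogonal matrices with -1 eigenvalues;
    that scaling is omitted here."""
    I = np.eye(A.shape[0])
    return np.linalg.solve(I + A, I - A)

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M - M.T                       # skew-symmetric parameter matrix
W = cayley_orthogonal(A)
print(np.allclose(W.T @ W, np.eye(6)))  # True: W is orthogonal
```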
Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections
A new parametrisation of the transition matrix is presented which allows efficient training of an RNN while ensuring that the matrix is always orthogonal, and gives similar benefits to the unitary constraint, without the time complexity limitations.
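A short numpy sketch of the Householder-reflection idea behind this parametrisation; the number and arrangement of reflections (and hence the capacity/cost trade-off) are choices, and the exact scheme in the paper may differ.

```python
import numpy as np

def householder(v):
    """Reflection H = I - 2 v v^T / (v^T v); H is orthogonal and symmetric."""
    v = v.reshape(-1, 1)
    return np.eye(v.shape[0]) - 2.0 * (v @ v.T) / (v.T @ v)

def orthogonal_from_reflections(vectors):
    """Product of k Householder reflections. Larger k gives a richer
    family of orthogonal matrices; smaller k is cheaper but lower capacity."""
    W = np.eye(vectors[0].shape[0])
    for v in vectors:
        W = householder(v) @ W
    return W

rng = np.random.default_rng(0)
vs = [rng.standard_normal(8) for _ in range(4)]   # k = 4 reflections
W = orthogonal_from_reflections(vs)
print(np.allclose(W.T @ W, np.eye(8)))  # True: W is orthogonal
```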
Fast orthogonal recurrent neural networks employing a novel parametrisation for orthogonal matrices
The algorithms of this paper are fast, tunable, and full-capacity; the target variable is updated by optimizing a matrix multiplier instead of using explicit gradient descent.
Input-Output Equivalence of Unitary and Contractive RNNs
It is shown that for any contractive RNN with ReLU activations there is a URNN with at most twice the number of hidden states and an identical input-output mapping, and that URNNs are as expressive as general RNNs.
Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform
In the experiments conducted, the scaled Cayley unitary recurrent neural network (scuRNN) achieves comparable or better results than scoRNN and other unitary RNNs without fixing the scaling matrix.
Acceleration Method for Learning Fine-Layered Optical Neural Networks
An optical neural network (ONN) is a promising system due to its high-speed and low-power operation. Its linear unit performs a multiplication of an input vector and a weight matrix in optical analog…
Rotational Unit of Memory
A novel RNN model that unifies the state-of-the-art approaches: the Rotational Unit of Memory (RUM), which is naturally a unitary matrix, providing architectures with the power to learn long-term dependencies by overcoming the vanishing and exploding gradient problem.
Gated Orthogonal Recurrent Units: On Learning to Forget
We present a novel recurrent neural network (RNN)–based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or…

References

Showing 1–10 of 32 references
Unitary Evolution Recurrent Neural Networks
This work constructs an expressive unitary weight matrix by composing several structured matrices that act as building blocks with parameters to be learned, and demonstrates the potential of this architecture by achieving state-of-the-art results in several hard tasks involving very long-term dependencies.
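The "structured matrices" mentioned here are, in the uRNN construction, factors such as diagonal phase matrices, reflections, a permutation, and Fourier transforms. The following numpy sketch assumes that factorization; the particular ordering and random initialization are illustrative rather than the paper's exact recipe.

```python
import numpy as np

def diag_phase(omega):
    """Diagonal matrix of unit-modulus phases exp(i * omega_k)."""
    return np.diag(np.exp(1j * omega))

def reflection(v):
    """Complex Householder-type reflection I - 2 v v^H / (v^H v)."""
    v = v.reshape(-1, 1)
    return np.eye(v.shape[0], dtype=complex) - 2.0 * (v @ v.conj().T) / (v.conj().T @ v)

def unitary_dft(n):
    """Unitary discrete Fourier transform matrix."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)

def urnn_style_matrix(n, rng):
    """Product of structured unitary factors (phases, reflections, a
    permutation, DFT and inverse DFT). Each factor is unitary and cheap
    to apply, so the product is unitary."""
    F = unitary_dft(n)
    D1, D2, D3 = (diag_phase(rng.uniform(0, 2 * np.pi, n)) for _ in range(3))
    R1, R2 = (reflection(rng.standard_normal(n) + 1j * rng.standard_normal(n)) for _ in range(2))
    P = np.eye(n)[rng.permutation(n)]          # permutation matrix
    return D3 @ R2 @ F.conj().T @ D2 @ P @ R1 @ F @ D1

rng = np.random.default_rng(0)
W = urnn_style_matrix(8, rng)
print(np.allclose(W.conj().T @ W, np.eye(8)))  # True: W is unitary
```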
Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections
A new parametrisation of the transition matrix is presented which allows efficient training of an RNN while ensuring that the matrix is always orthogonal, and gives similar benefits to the unitary constraint, without the time complexity limitations.
Full-Capacity Unitary Recurrent Neural Networks
This work provides a theoretical argument to determine if a unitary parameterization has restricted capacity, and shows how a complete, full-capacity unitary recurrence matrix can be optimized over the differentiable manifold of unitary matrices.
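A minimal numpy sketch of optimizing directly over the manifold of unitary matrices as described here, assuming a Cayley-retraction (multiplicative) update built from a skew-Hermitian direction; the step-size convention and exact retraction in the paper may differ, and the gradient below is a random stand-in.

```python
import numpy as np

def unitary_gradient_step(W, G, lr):
    """One Riemannian-style update on the unitary manifold: form the
    skew-Hermitian direction A = G W^H - W G^H from the Euclidean
    gradient G, then take a Cayley-retraction step. The result stays
    (numerically) unitary."""
    I = np.eye(W.shape[0], dtype=complex)
    A = G @ W.conj().T - W @ G.conj().T          # skew-Hermitian
    return np.linalg.solve(I + (lr / 2) * A, (I - (lr / 2) * A) @ W)

rng = np.random.default_rng(0)
n = 6
W, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))  # stand-in gradient
W_new = unitary_gradient_step(W, G, lr=0.05)
print(np.allclose(W_new.conj().T @ W_new, np.eye(n)))  # True: still unitary
```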
A Simple Way to Initialize Recurrent Networks of Rectified Linear Units
This paper proposes a simpler solution using recurrent neural networks composed of rectified linear units, which is comparable to LSTM on four benchmarks: two toy problems involving long-range temporal structures, a large language modeling problem, and a benchmark speech recognition problem.
Recurrent Orthogonal Networks and Long-Memory Tasks
This work carefully analyzes two synthetic datasets, originally outlined by Hochreiter and Schmidhuber (1997), which are used to evaluate the ability of RNNs to store information over many time steps, and explicitly constructs RNN solutions to these problems.
Bidirectional Recurrent Neural Networks as Generative Models
This work proposes two probabilistic interpretations of bidirectional RNNs that can be used to reconstruct missing gaps efficiently, and provides results on music data for which Bayesian inference is computationally infeasible, demonstrating the scalability of the proposed methods.
Sequence to Sequence Learning with Neural Networks
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence, which made the optimization problem easier.
Orthogonal RNNs and Long-Memory Tasks
This work carefully analyzes two synthetic datasets, originally outlined by Hochreiter and Schmidhuber (1997), which are used to evaluate the ability of RNNs to store information over many time steps, and explicitly constructs RNN solutions to these problems.
ImageNet classification with deep convolutional neural networks
A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes, and employed a recently developed regularization method called "dropout" that proved to be very effective.
Long-term recurrent convolutional networks for visual recognition and description
A novel recurrent convolutional architecture suitable for large-scale visual learning which is end-to-end trainable, and shows such models have distinct advantages over state-of-the-art models for recognition or generation which are separately defined and/or optimized.