# Expressive power of recurrent neural networks

```bibtex
@article{Khrulkov2018ExpressivePO,
  title   = {Expressive power of recurrent neural networks},
  author  = {Valentin Khrulkov and Alexander Novikov and I. Oseledets},
  journal = {ArXiv},
  year    = {2018},
  volume  = {abs/1711.00811}
}
```

Deep neural networks are surprisingly efficient at solving practical tasks, but the theory behind this phenomenon is only starting to catch up with the practice. Numerous works show that depth is the key to this efficiency. A certain class of deep convolutional networks, namely those that correspond to the Hierarchical Tucker (HT) tensor decomposition, has been proven to have exponentially higher expressive power than shallow networks; that is, a shallow network of exponential width is required…
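The abstract refers to networks that correspond to tensor decompositions; the paper itself relates recurrent networks to the Tensor-Train (TT) format. As a rough illustration of that formalism, here is a minimal numpy sketch of the sequential-SVD construction of a TT decomposition (this is not the paper's code; `tt_svd` and `tt_reconstruct` are names of my own choosing, and `max_rank` plays the role of the TT-rank bound):

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Sequential-SVD factorization of a d-way tensor into a 'train'
    of 3-way cores G_k of shape (r_{k-1}, n_k, r_k)."""
    shape = tensor.shape
    cores, r_prev, mat = [], 1, tensor
    for k in range(len(shape) - 1):
        mat = mat.reshape(r_prev * shape[k], -1)
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, S.size)        # truncate to the rank bound
        cores.append(U[:, :r].reshape(r_prev, shape[k], r))
        mat = S[:r, None] * Vt[:r]       # carry the remainder forward
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the train of cores back into a full tensor."""
    res = cores[0]
    for core in cores[1:]:
        res = np.tensordot(res, core, axes=([-1], [0]))
    return res.reshape([c.shape[1] for c in cores])
```

With `max_rank` large enough the decomposition is exact, while a small `max_rank` trades accuracy for a parameter count linear in the tensor order — the kind of rank restriction whose effect on expressive power the paper analyzes.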

## 85 Citations

Generalized Tensor Models for Recurrent Neural Networks

- Computer Science, ICLR
- 2019

This work attempts to reduce the gap between theory and practice by extending the theoretical analysis to RNNs which employ various nonlinearities, such as Rectified Linear Unit (ReLU), and shows that they also benefit from properties of universality and depth efficiency.

Tucker Decomposition Network: Expressive Power and Comparison

- Computer Science, ArXiv
- 2019

The main contribution of this paper is to develop a deep network based on Tucker tensor decomposition, and analyze its expressive power.

On the Memory Mechanism of Tensor-Power Recurrent Models

- Computer Science, AISTATS
- 2021

This work proves that a large degree p is an essential condition to achieve the long memory effect, yet it would lead to unstable dynamical behaviors, and extends the degree p from discrete to a differentiable domain, such that it is efficiently learnable from a variety of datasets.

Depth Enables Long-Term Memory for Recurrent Neural Networks

- Computer Science, ArXiv
- 2020

It is proved that deep recurrent networks support Start-End separation ranks which are combinatorially higher than those supported by their shallow counterparts, and established that depth brings forth an overwhelming advantage in the ability of recurrent networks to model long-term dependencies.

Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning

- Computer Science, Machine Learning
- 2022

In this paper, we present connections between three models used in different research fields: weighted finite automata~(WFA) from formal languages and linguistics, recurrent neural networks used in…

Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks

- Computer Science, ICML
- 2022

Inspired by the theory, an explicit regularization discouraging locality is designed, and its ability to improve the performance of modern convolutional networks on non-local tasks is demonstrated, in defiance of conventional wisdom by which architectural changes are needed.

On the Long-Term Memory of Deep Recurrent Networks

- Computer Science
- 2017

It is established that depth brings forth an overwhelming advantage in the ability of recurrent networks to model long-term dependencies, and an exemplar of quantifying this key attribute is provided, which may be readily extended to other RNN architectures of interest, e.g. variants of LSTM networks.

Compact Neural Architecture Designs by Tensor Representations

- Computer Science, Frontiers in Artificial Intelligence
- 2022

A framework of tensorial neural networks (TNNs) extending existing linear layers on low-order tensors to multilinear operations on higher-order tensors is proposed, demonstrating that TNNs outperform the state-of-the-art low-rank methods on a wide range of backbone networks and datasets.

Adaptive Learning of Tensor Network Structures

- Computer Science
- 2020

This work develops a generic and efficient adaptive algorithm to jointly learn the structure and the parameters of a TN from data; it outperforms the state-of-the-art evolutionary topology search introduced in [18] for tensor decomposition of images and finds efficient structures to compress neural networks, outperforming popular TT-based approaches.

Tensor-Train Recurrent Neural Networks for Interpretable Multi-Way Financial Forecasting

- Computer Science, 2021 International Joint Conference on Neural Networks (IJCNN)
- 2021

It is shown, through the analysis of TT-factors, that the physical meaning underlying tensor decomposition enables the TT-RNN model to aid the interpretability of results, thus mitigating the notorious "black-box" issue associated with neural networks.

## References

Showing 1-10 of 33 references.

On the Expressive Power of Deep Learning: A Tensor Analysis

- Computer Science, COLT 2016
- 2015

It is proved that besides a negligible set, all functions that can be implemented by a deep network of polynomial size, require exponential size in order to be realized (or even approximated) by a shallow network.

Convolutional Rectifier Networks as Generalized Tensor Decompositions

- Computer Science, ICML
- 2016

Developing effective methods for training convolutional arithmetic circuits may give rise to a deep learning architecture that is provably superior to convolutional rectifier networks, which has so far been overlooked by practitioners.

On the Expressive Power of Deep Neural Networks

- Computer Science, ICML
- 2017

We propose a new approach to the problem of neural network expressivity, which seeks to characterize how structural properties of a neural network family affect the functions it is able to compute.…

Opening the Black Box of Deep Neural Networks via Information

- Computer Science, ArXiv
- 2017

This work demonstrates the effectiveness of the Information-Plane visualization of DNNs and shows that the training time is dramatically reduced when adding more hidden layers, and the main advantage of the hidden layers is computational.

On the Number of Linear Regions of Deep Neural Networks

- Computer Science, NIPS
- 2014

We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep…

Deep Residual Learning for Image Recognition

- Computer Science, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

On the Expressive Efficiency of Sum Product Networks

- Computer Science, ArXiv
- 2014

A result is established showing the existence of a relatively simple distribution with fully tractable marginal densities which cannot be efficiently captured by D&C SPNs of any depth, but which can be efficiently captured by various other deep generative models.

Shallow vs. Deep Sum-Product Networks

- Computer Science, NIPS
- 2011

It is proved there exist families of functions that can be represented much more efficiently with a deep network than with a shallow one, i.e. with substantially fewer hidden units.

On the importance of initialization and momentum in deep learning

- Computer Science, ICML
- 2013

It is shown that when stochastic gradient descent with momentum uses a well-designed random initialization and a particular type of slowly increasing schedule for the momentum parameter, it can train both DNNs and RNNs to levels of performance that were previously achievable only with Hessian-Free optimization.

Speech recognition with deep recurrent neural networks

- Computer Science, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2013

This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs.