Corpus ID: 220042270

Tensor Networks for Probabilistic Sequence Modeling

@inproceedings{Miller2021TensorNF,
  title={Tensor Networks for Probabilistic Sequence Modeling},
  author={Jacob Miller and Guillaume Rabusseau and John Terilla},
  booktitle={AISTATS},
  year={2021}
}
Tensor networks are a powerful modeling framework developed for computational many-body physics, which have only recently been applied within machine learning. In this work we utilize a uniform matrix product state (u-MPS) model for probabilistic modeling of sequence data. We first show that u-MPS enable sequence-level parallelism, allowing length-n sequences to be evaluated in depth O(log n). We then introduce a novel generative algorithm giving trained u-MPS the ability to efficiently sample… 
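
The objects described in the abstract are simple enough to sketch. A u-MPS assigns each sequence s = s_1…s_n an unnormalized Born-rule score |α^T A_{s_1} ⋯ A_{s_n} ω|^2, where a single core tensor A is shared across positions, and the chain of matrix products can be reduced pairwise, which is the source of the O(log n) evaluation depth. The NumPy sketch below is illustrative only: the variable names, dimensions, and pairwise-reduction helper are assumptions of this summary, not the authors' code.

import numpy as np

# Illustrative sizes: vocabulary size d, bond dimension D (assumed, not from the paper).
d, D = 4, 8
rng = np.random.default_rng(0)

# A u-MPS is one core tensor A of shape (d, D, D) shared across all positions,
# plus boundary vectors alpha and omega.
A = rng.normal(size=(d, D, D))
alpha = rng.normal(size=D)
omega = rng.normal(size=D)

def umps_score(seq):
    # Unnormalized Born-rule score (alpha^T A[s1] ... A[sn] omega)^2 for real cores.
    # The matrices are combined by pairwise (tree) reduction, so with enough
    # processors the n-fold product has only O(log n) sequential depth.
    mats = [A[s] for s in seq]
    while len(mats) > 1:
        pairs = [mats[i] @ mats[i + 1] for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2:          # carry an unpaired matrix to the next level
            pairs.append(mats[-1])
        mats = pairs
    amplitude = alpha @ mats[0] @ omega
    return amplitude ** 2

print(umps_score([0, 2, 1, 3]))

Turning these scores into probabilities requires a normalization constant (a sum of squared amplitudes over all length-n strings), which tensor networks can compute by contraction rather than brute-force enumeration; that step is omitted here.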

Citations

Adaptive Tensor Learning with Tensor Networks
TLDR
A generic and efficient adaptive algorithm for tensor learning, based on a simple greedy approach: starting from a rank-one tensor, it optimizes a differentiable loss function and successively identifies the most promising tensor network edges for small rank increments.
Quantum Tensor Networks, Stochastic Processes, and Weighted Automata
TLDR
This work shows how stationary or uniform versions of popular quantum tensor network models have equivalent representations in the stochastic processes and weighted automata literature, in the limit of infinitely long sequences.
Permutation Search of Tensor Network Structures via Local Sampling
TLDR
Theoretically, the counting and metric properties of the TN-PS search space are established, and a novel meta-heuristic algorithm is proposed in which the search proceeds by randomly sampling within a neighborhood defined by this theory and then recurrently updating that neighborhood until convergence.
Tensor Train for Global Optimization Problems in Robotics
TLDR
The generality of the framework and its relevance to robotics is demonstrated by applying the proposed method to inverse kinematics and motion planning problems with a 7-DoF manipulator.
Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs
TLDR
This work uses tensor rank decomposition (a.k.a. CPD) to reduce the computational complexity of inference for a subset of FGGs subsuming HMMs and PCFGs, and conducts experiments on HMM language modeling and unsupervised PCFG parsing that show better performance; a toy sketch of the resulting rank-space forward recursion appears after this list.
Improvements to Gradient Descent Methods for Quantum Tensor Network Machine Learning
TLDR
A ‘copy node’ method is introduced that successfully initializes arbitrary tensor networks, in addition to a gradient based regularization technique for bond dimensions that produces quantum-inspired tensor network models with far fewer parameters, while improving generalization performance.
Evaluating Generalization in Classical and Quantum Generative Models
TLDR
Using the sample-based generalization metrics proposed here, any generative model, from state-of-the-art classical generative models such as GANs to quantum models, can be evaluated on the same footing within a concrete, well-defined framework, and these metrics can diagnose trainability issues such as mode collapse and overfitting.
Explainable natural language processing with matrix product states
TLDR
Contrary to a common belief that long-range information propagation is the main source of RNNs’ successes, it is shown that single-layer RACs harness high expressiveness from the subtle interplay between the information propagation and the word vector embeddings.
An enriched category theory of language: from syntax to semantics
TLDR
This paper proposes a mathematical framework for passing from probability distributions on extensions of given texts to an enriched category containing semantic information, which is a category enriched over the unit interval.
Lower and Upper Bounds on the Pseudo-Dimension of Tensor Network Models
TLDR
Upper and lower bounds on the VC-dimension and pseudo-dimension of a large class of TN models for classification, regression, and completion are derived, together with a generalization bound that applies to classification with low-rank matrices as well as to linear classifiers based on any of the commonly used tensor decomposition models.
...
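
As a toy illustration of the rank-space idea mentioned in the "Dynamic Programming in Rank Space" entry above (a minimal sketch under assumed sizes, not that paper's algorithm): if an HMM's m × m transition matrix is kept in factored form T ≈ U V, with U of shape m × r and V of shape r × m, each step of the forward recursion costs O(mr) instead of O(m²).

import numpy as np

# Assumed toy sizes: m hidden states, rank r << m, vocabulary v, sequence length n.
m, r, v, n = 512, 16, 100, 20
rng = np.random.default_rng(1)

# Random row-stochastic factors; their product T = U @ V is then a valid
# transition matrix of rank at most r (illustration only, not a trained HMM).
U = rng.random((m, r)); U /= U.sum(axis=1, keepdims=True)
V = rng.random((r, m)); V /= V.sum(axis=1, keepdims=True)
B = rng.random((m, v)); B /= B.sum(axis=1, keepdims=True)   # emission probabilities
pi = np.full(m, 1.0 / m)                                    # uniform initial state
obs = rng.integers(0, v, size=n)                            # a random observation sequence

# Forward recursion "in rank space": alpha_{t+1} = ((alpha_t @ U) @ V) * B[:, x_{t+1}].
# Each step costs O(m r) rather than the dense O(m^2).
alpha = pi * B[:, obs[0]]
for x in obs[1:]:
    alpha = (alpha @ U) @ V * B[:, x]
print("sequence log-likelihood:", np.log(alpha.sum()))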

References

Showing 1-10 of 67 references
Tree Tensor Networks for Generative Modeling
TLDR
It is shown that the TTN is superior to MPSs for generative modeling, both in capturing pixel correlations in natural images and in achieving better log-likelihood scores on standard handwritten-digit datasets.
From Probabilistic Graphical Models to Generalized Tensor Networks for Supervised Learning
TLDR
This work explores the connection between tensor networks and probabilistic graphical models, and shows that it motivates the definition of generalized tensor networks where information from a tensor can be copied and reused in other parts of the network.
Unsupervised Generative Modeling Using Matrix Product States
TLDR
This work proposes a generative model using matrix product states, which is a tensor network originally proposed for describing (particularly one-dimensional) entangled quantum states, and enjoys efficient learning analogous to the density matrix renormalization group method.
Supervised learning with generalized tensor networks
TLDR
This work explores the connection between tensor networks and probabilistic graphical models, and shows that it motivates the definition of generalized tensor networks where information from a tensor can be copied and reused in other parts of the network.
On the Expressive Power of Deep Learning: A Tensor Analysis
TLDR
It is proved that, besides a negligible set, all functions that can be implemented by a deep network of polynomial size require exponential size in order to be realized (or even approximated) by a shallow network.
Tensorizing Neural Networks
TLDR
This paper converts the dense weight matrices of the fully-connected layers to the Tensor Train format, so that the number of parameters is reduced by a huge factor while the expressive power of the layer is preserved (a toy TT matrix-vector product is sketched after this reference list).
Sequence to Sequence Learning with Neural Networks
TLDR
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence which made the optimization problem easier.
Differentiable Programming Tensor Networks
TLDR
This work presents essential techniques for differentiating through tensor network contractions, including stable AD for tensor decomposition and efficient backpropagation through fixed-point iterations, and removes laborious human effort in deriving and implementing analytical gradients for tensor network programs.
Bidirectional Recurrent Neural Networks as Generative Models
TLDR
This work proposes two probabilistic interpretations of bidirectional RNNs that can be used to reconstruct missing gaps efficiently and provides results on music data for which the Bayesian inference is computationally infeasible, demonstrating the scalability of the proposed methods.
Exponential Machines
TLDR
This paper introduces Exponential Machines (ExM), a predictor that models all interactions of every order in a factorized format called Tensor Train (TT), and shows that the model achieves state-of-the-art performance on synthetic data with high-order interactions and performs on par on the MovieLens 100K recommender-system dataset.
...
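
Finally, a rough sketch of the Tensor Train idea behind "Tensorizing Neural Networks" above (assumed shapes and ranks, not that paper's code): a 1024 × 1024 dense weight matrix is replaced by four small TT cores, here about 3.4K parameters instead of roughly 1M, and the layer is applied without ever materializing the dense matrix.

import numpy as np

rng = np.random.default_rng(2)

# Assumed factorization of a 1024 x 1024 dense layer: 1024 = 4*4*8*8 on both
# the input and output side, with modest TT ranks (all choices illustrative).
in_modes = [4, 4, 8, 8]
out_modes = [4, 4, 8, 8]
ranks = [1, 6, 6, 6, 1]                  # r_0 .. r_4; boundary ranks are 1

# Core k has shape (r_{k-1}, m_k, n_k, r_k); about 3.4K parameters in total here,
# versus 1024*1024 ~ 1.05M for the dense matrix.
cores = [rng.normal(size=(ranks[k], out_modes[k], in_modes[k], ranks[k + 1]))
         for k in range(len(in_modes))]

def tt_matvec(cores, x, in_modes):
    # Multiply the TT-format matrix by a vector without forming the dense matrix.
    # t keeps shape (current rank, output modes so far, n_k, remaining input modes).
    d = len(cores)
    t = x.reshape(1, 1, in_modes[0], -1)
    for k, G in enumerate(cores):
        t = np.einsum('amkb,acks->bcms', G, t)       # sum over rank and n_k
        next_n = in_modes[k + 1] if k + 1 < d else 1
        t = t.reshape(G.shape[-1], -1, next_n, t.shape[-1] // next_n)
    return t.reshape(-1)                              # length = prod(out_modes)

x = rng.normal(size=int(np.prod(in_modes)))
y = tt_matvec(cores, x, in_modes)
print(y.shape)                                        # (1024,)

In a real TT layer the cores would be trained parameters; the random cores here only demonstrate the shape bookkeeping and the contraction order.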