• Corpus ID: 218487431

Sum-Product-Transform Networks: Exploiting Symmetries using Invertible Transformations

Tomáš Pevný, Václav Šmídl, M. Trapp, Ondrej Polácek, Tomáš Oberhuber
In this work, we propose Sum-Product-Transform Networks (SPTN), an extension of sum-product networks that uses invertible transformations as additional internal nodes. The type and placement of transformations determine the properties of the resulting SPTN, with many interesting special cases. Importantly, SPTNs with Gaussian leaves and affine transformations keep tractable the same inference tasks that can be computed efficiently in SPNs. We propose to store affine transformations in their SVD… 
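
The abstract is cut off before it describes the SVD storage, so the following is not the authors' parameterization. It is only a minimal sketch, under stated assumptions, of why an SVD-style factorization can keep density evaluation cheap for an affine transformation over a Gaussian leaf. It assumes a standard normal leaf z, an affine map x = W z + b, and W = U diag(d) Vᵀ with orthogonal U and V. The names `random_orthogonal` and `affine_gaussian_logpdf` are hypothetical and chosen for illustration.

```python
import numpy as np


def random_orthogonal(dim, rng):
    # Illustrative helper: orthogonal factor from the QR decomposition of a random matrix.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q


def affine_gaussian_logpdf(x, U, d, V, b):
    """Log-density of x = W z + b with z ~ N(0, I) and W = U diag(d) V^T (assumed form)."""
    # Inverse map: W^{-1} = V diag(1/d) U^T, so no matrix inversion is needed.
    z = V @ ((U.T @ (x - b)) / d)
    # Orthogonal factors have |det| = 1, so log|det W| is a sum over d.
    log_det = np.sum(np.log(np.abs(d)))
    log_base = -0.5 * (z @ z + z.size * np.log(2.0 * np.pi))
    # Change of variables: log p(x) = log N(z; 0, I) - log|det W|.
    return log_base - log_det


rng = np.random.default_rng(0)
dim = 3
U, V = random_orthogonal(dim, rng), random_orthogonal(dim, rng)
d = np.array([0.5, 1.5, 2.0])
b = np.array([1.0, -1.0, 0.25])
x = rng.standard_normal(dim)

# Reference value: x is Gaussian with mean b and covariance W W^T.
W = U @ np.diag(d) @ V.T
cov = W @ W.T
diff = x - b
reference = -0.5 * (
    diff @ np.linalg.solve(cov, diff)
    + np.linalg.slogdet(cov)[1]
    + dim * np.log(2.0 * np.pi)
)
value = affine_gaussian_logpdf(x, U, d, V, b)
```

In this sketch `value` and `reference` agree up to floating-point error, while the factored version needs only matrix-vector products and a sum of logarithms instead of a general determinant or linear solve.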




Training Invertible Linear Layers through Rank-One Perturbations

This work presents a novel approach for training invertible linear layers by training rank-one perturbations and adding them to the actual weight matrices infrequently, which allows keeping track of inverses and determinants without ever explicitly computing them.

Training Neural Networks with Property-Preserving Parameter Perturbations

This work presents a novel, general approach of preserving matrix properties by using parameterized perturbations in lieu of directly optimizing the network parameters, and shows how such invertible blocks improve the mixing of coupling layers and thus the mode separation of the resulting normalizing flows.

Preserving Properties of Neural Networks by Perturbative Updates

This work presents a novel, general approach to preserve network properties by using parameterized perturbations, and shows how such invertible blocks improve mode separation when applied to normalizing flows and Boltzmann generators.

Fitting large mixture models using stochastic component selection

A combination of expectation maximization and the Metropolis-Hastings algorithm is proposed that evaluates only a small number of stochastically sampled mixture components, thus substantially reducing the computational cost.

Deep Residual Mixture Models

We propose Deep Residual Mixture Models (DRMMs) which share the many desirable properties of Gaussian Mixture Models (GMMs), but with a crucial benefit: The modeling capacity of a DRMM can grow

Comparison of Anomaly Detectors: Context Matters

The main sources of variability are identified as the experimental conditions: 1) the type of dataset and the nature of the anomalies, and 2) the strategy for selecting hyperparameters, especially the number of anomalies available in the validation set.



Sum-Product-Quotient Networks

It is proved that there are distributions which SPQNs can compute efficiently but which require SPNs of exponential size, which narrows the gap in expressivity between tractable graphical models and other neural-network-based generative models.

On Theoretical Properties of Sum-Product Networks

It is shown that the weights of any complete and consistent SPN can be transformed into locally normalized weights without changing the SPN distribution, and that consistent SPNs cannot model distributions significantly (exponentially) more compactly than decomposable SPNs.

Bayesian Learning of Sum-Product Networks

A well-principled Bayesian framework for SPN structure learning, which consistently and robustly learns SPN structures under missing data, and a natural parametrisation for an important and widely used special case of SPNs.

Collapsed Variational Inference for Sum-Product Networks

This work proposes a novel deterministic collapsed variational inference algorithm for SPNs that is computationally efficient, easy to implement and at the same time allows us to incorporate prior information into the optimization formulation.

Improving Variational Auto-Encoders using Householder Flow

This paper proposes a volume-preserving flow that uses a series of Householder transformations, which yields a more flexible variational posterior and competitive results compared to other normalizing flows.

Cheap Orthogonal Constraints in Neural Networks: A Simple Parametrization of the Orthogonal and Unitary Group

A novel approach to perform first-order optimization with orthogonal and unitary constraints based on a parametrization stemming from Lie group theory through the exponential map is introduced, showing faster, accurate, and more stable convergence in several tasks designed to test RNNs.

Unitary Evolution Recurrent Neural Networks

This work constructs an expressive unitary weight matrix by composing several structured matrices that act as building blocks with parameters to be learned, and demonstrates the potential of this architecture by achieving state-of-the-art results in several hard tasks involving very long-term dependencies.

Simplifying, Regularizing and Strengthening Sum-Product Network Structure Learning

This work enhances one of the best structure learners, LearnSPN, aiming to improve both the structural quality of the learned networks and their achieved likelihoods, and supports its claims by empirically evaluating the learned SPNs on several benchmark datasets against other competitive SPN and PGM structure learners.

Learning Graph-Structured Sum-Product Networks for Probabilistic Semantic Maps

This work demonstrates how GraphSPNs can be used to bolster inference about semantic, conceptual place descriptions using noisy topological relations discovered by a robot exploring large-scale office spaces, and exploits the probabilistic nature of the model to infer marginal distributions over semantic descriptions of as yet unexplored places.

MADE: Masked Autoencoder for Distribution Estimation

This work introduces a simple modification for autoencoder neural networks that yields powerful generative models and proves that this approach is competitive with state-of-the-art tractable distribution estimators.