# Higher Order Statistical Decorrelation without Information Loss

```bibtex
@inproceedings{Deco1994HigherOS,
  title     = {Higher Order Statistical Decorrelation without Information Loss},
  author    = {Gustavo Deco and Wilfried Brauer},
  booktitle = {NIPS},
  year      = {1994}
}
```

A neural network learning paradigm based on information theory is proposed as a way to perform, in an unsupervised fashion, redundancy reduction among the elements of the output layer without loss of information from the sensory input. The model performs nonlinear decorrelation up to higher orders of the cumulant tensors and yields probabilistically independent components at the output layer. This means that no Gaussian distribution need be assumed either at the input or at…
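As a minimal illustration of the core idea (not the paper's algorithm), the numpy sketch below shows how a third-order cross-cumulant can expose a nonlinear dependence that second-order decorrelation (covariance) misses; the sources `s1`, `s2` and the mixture `x2 = s2 + 0.5*s1**2` are made-up examples.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_cumulant_3(a, b, c):
    """Third-order cross-cumulant; for centered samples it is E[abc]."""
    a, b, c = a - a.mean(), b - b.mean(), c - c.mean()
    return np.mean(a * b * c)

# Two independent non-Gaussian sources.
s1 = rng.uniform(-1, 1, 100_000)
s2 = rng.exponential(1.0, 100_000)

# A quadratic mixture: x1 and x2 are uncorrelated (s1 is symmetric,
# so E[s1^3] = 0), yet they are statistically dependent.
x1 = s1
x2 = s2 + 0.5 * s1**2

print(np.cov(x1, x2)[0, 1])            # near zero: second-order decorrelated
print(cross_cumulant_3(x1, x1, x2))    # clearly nonzero (~2/45 in expectation)
print(cross_cumulant_3(s1, s1, s2))    # near zero: truly independent sources
```

Vanishing covariance alone is therefore not enough to certify independence; the paper's point is to drive such higher-order cross-cumulants to zero as well.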

## 39 Citations

### Decomposing neural networks as mappings of correlation functions

- Computer Science, Physical Review Research
- 2022

The mapping between probability distributions implemented by a deep feed-forward network is studied as an iterated transformation of distributions, where the non-linearity in each layer transfers information between different orders of correlation functions, identifying the essential statistics in the data.

### Nonparametric Data Selection for Improvement of Parametric Neural Learning: A Cumulant-Surrogate Method

- Computer Science, ICANN
- 1996

A nonparametric cumulant based statistical approach for detecting linear and nonlinear statistical dependences in non-stationary time series and measuring the predictability which tests the null hypothesis of statistical independence by the surrogate method is introduced.
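The surrogate idea can be sketched with a simplified permutation variant (not the paper's exact procedure): a higher-order dependence statistic is computed on the data and compared against its distribution under surrogates in which the dependence has been destroyed. The statistic and series below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def dependence_stat(x, y):
    """Simple higher-order statistic: squared cross-cumulant E[x^2 y] of centered data."""
    x, y = x - x.mean(), y - y.mean()
    return np.mean(x * x * y) ** 2

def surrogate_test(x, y, n_surrogates=200):
    """Null hypothesis of independence: shuffling y destroys any dependence on x.
    Returns the fraction of surrogates whose statistic reaches the observed one."""
    observed = dependence_stat(x, y)
    count = sum(dependence_stat(x, rng.permutation(y)) >= observed
                for _ in range(n_surrogates))
    return count / n_surrogates  # small value => reject independence

x = rng.standard_normal(5000)
y_dep = x**2 + 0.5 * rng.standard_normal(5000)  # nonlinearly dependent, linearly uncorrelated
y_ind = rng.standard_normal(5000)

print(surrogate_test(x, y_dep))  # small: dependence detected
print(surrogate_test(x, y_ind))  # typically large: consistent with independence
```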

### Improving Variational Autoencoders with Inverse Autoregressive Flow

- Computer Science, NIPS
- 2016

In experiments with natural images, it is demonstrated that autoregressive flow leads to significant performance gains and is well applicable to models with high-dimensional latent spaces, such as convolutional generative models.
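A single inverse autoregressive flow step can be sketched in a few lines. The strictly lower-triangular linear maps `W_mu` and `W_ls` below are hypothetical stand-ins for the autoregressive networks: each output dimension depends only on earlier input dimensions, so the forward pass is fully parallel while the inverse is sequential.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4
# Hypothetical autoregressive parameters: dimension i sees only z_{<i}.
W_mu = np.tril(rng.standard_normal((D, D)), k=-1)
W_ls = 0.1 * np.tril(rng.standard_normal((D, D)), k=-1)

def iaf_forward(z):
    """One IAF step: x_i = mu_i(z_{<i}) + sigma_i(z_{<i}) * z_i.
    All dimensions are computed in parallel, which makes sampling fast."""
    mu, log_sigma = W_mu @ z, W_ls @ z
    x = mu + np.exp(log_sigma) * z
    log_det = log_sigma.sum()  # Jacobian is triangular
    return x, log_det

def iaf_inverse(x):
    """Inverting is sequential: z_i needs z_{<i} to be recovered first."""
    z = np.zeros_like(x)
    for i in range(len(x)):
        mu_i = W_mu[i] @ z
        sigma_i = np.exp(W_ls[i] @ z)
        z[i] = (x[i] - mu_i) / sigma_i
    return z

z = rng.standard_normal(D)
x, log_det = iaf_forward(z)
print(np.allclose(z, iaf_inverse(x)))  # True: the map is bijective
```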

### Learning Bijective Feature Maps for Linear ICA

- Computer Science, AISTATS
- 2021

This paper develops a method that jointly learns a linear independent component analysis model with non-linear bijective feature maps that achieves better unsupervised latent factor discovery than flow-based models and linear ICA on large image datasets.

### Integer Discrete Flows and Lossless Compression

- Computer Science, NeurIPS
- 2019

This work introduces a flow-based generative model for ordinal discrete data called Integer Discrete Flow (IDF): a bijective integer map that can learn rich transformations on high-dimensional data and introduces a flexible transformation layer called integer discrete coupling.
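The coupling idea can be sketched as an additive integer map: one half of the vector is shifted by a rounded function of the other half, so the transformation stays integer-to-integer yet remains exactly invertible. The function `t` below is a hypothetical stand-in for the learned translation network.

```python
import numpy as np

rng = np.random.default_rng(3)

def t(x):
    """Hypothetical translation network; any real-valued function works."""
    return 3.7 * np.sin(x)

def integer_coupling_forward(x):
    """Additive integer coupling: y1 = x1, y2 = x2 + round(t(x1))."""
    x1, x2 = x[:len(x)//2], x[len(x)//2:]
    y2 = x2 + np.round(t(x1)).astype(np.int64)
    return np.concatenate([x1, y2])

def integer_coupling_inverse(y):
    """Exact inverse: subtracting the same rounded shift recovers x."""
    y1, y2 = y[:len(y)//2], y[len(y)//2:]
    x2 = y2 - np.round(t(y1)).astype(np.int64)
    return np.concatenate([y1, x2])

x = rng.integers(-10, 10, size=8)
y = integer_coupling_forward(x)
print(np.array_equal(integer_coupling_inverse(y), x))  # True
print(y.dtype)  # an integer dtype: no rounding error accumulates
```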

### The Reversible Residual Network: Backpropagation Without Storing Activations

- Computer Science, NIPS
- 2017

The Reversible Residual Network (RevNet) is presented, a variant of ResNets in which each layer's activations can be reconstructed exactly from the next layer's, so the activations for most layers need not be stored in memory during backpropagation.
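The reconstruction trick can be sketched directly; `F` and `G` below are hypothetical stand-ins for the residual subnetworks.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical residual functions (stand-ins for small subnetworks).
W_f = rng.standard_normal((3, 3))
W_g = rng.standard_normal((3, 3))
F = lambda h: np.tanh(W_f @ h)
G = lambda h: np.tanh(W_g @ h)

def rev_forward(x1, x2):
    """Reversible block: y1 = x1 + F(x2), y2 = x2 + G(y1)."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def rev_inverse(y1, y2):
    """Inputs recovered exactly from outputs by undoing the two
    additions in reverse order, so activations need not be stored."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
r1, r2 = rev_inverse(*rev_forward(x1, x2))
print(np.allclose(x1, r1) and np.allclose(x2, r2))  # True
```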

### General Probabilistic Surface Optimization and Log Density Estimation

- Computer Science, ArXiv
- 2019

A novel algorithm family is presented that generalizes many unsupervised techniques, including unnormalized and energy models, and allows different statistical modalities to be inferred from data samples; new PSO-based inference methods are derived as a demonstration of PSO's usability.

### Improved Variational Inference with Inverse Autoregressive Flow (UvA-DARE repository version)

- Computer Science
- 2016

It is demonstrated that a novel type of variational autoencoder, coupled with IAF, is competitive with neural autoregressive models in terms of attained log-likelihood on natural images, while allowing significantly faster synthesis.

### Multi-scale Attention Flow for Probabilistic Time Series Forecasting

- Computer Science, ArXiv
- 2022

This work proposes a novel non-autoregressive deep learning model, Multi-scale Attention Normalizing Flow (MANF), which combines multi-scale attention with relative position information and represents the multivariate data distribution with a conditioned normalizing flow.

### Improving Variational Autoencoders with Inverse Autoregressive Flow

- Computer Science
- 2016

A new type of normalizing flow, inverse autoregressive flow (IAF), is proposed that, in contrast to earlier published flows, scales well to high-dimensional latent spaces and significantly improves upon diagonal Gaussian approximate posteriors.

## References


### Supervised Factorial Learning

- Computer Science, Neural Computation
- 1993

This work lends support to Barlow's argument for factorial sensory processing by demonstrating how it can solve actual pattern recognition problems; two techniques for supervised factorial learning are explored, one of which gives a novel distributed solution requiring only positive examples.

### Unsupervised Learning. Neural Computation, 1, 295-311

- A. Papoulis
- 1989