# On the Difference between the Information Bottleneck and the Deep Information Bottleneck

@article{Wieczorek2020OnTD,
  title   = {On the Difference between the Information Bottleneck and the Deep Information Bottleneck},
  author  = {Aleksander Wieczorek and Volker Roth},
  journal = {Entropy},
  year    = {2020},
  volume  = {22}
}

Combining the information bottleneck model with deep learning by replacing mutual information terms with deep neural nets has proven successful in areas ranging from generative modelling to interpreting deep neural networks. In this paper, we revisit the deep variational information bottleneck and the assumptions needed for its derivation. The two assumed properties of the data, X and Y, and their latent representation T, take the form of two Markov chains T−X−Y and X−T−Y. Requiring both to…
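The deep variational information bottleneck discussed here is typically trained with a loss of the form −E[log q(y|t)] + β·KL(p(t|x) ‖ r(t)). A minimal NumPy sketch, assuming a diagonal-Gaussian encoder and a standard-normal prior (the function names and the β value are illustrative, not taken from the paper):

```python
import numpy as np

def kl_diag_gaussian(mu, log_var):
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ), closed form, per sample
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def vib_loss(log_q_y_given_t, mu, log_var, beta=1e-3):
    # Variational IB objective: negative log-likelihood of the label
    # under the decoder plus beta times the rate (compression) term.
    return -np.mean(log_q_y_given_t) + beta * np.mean(kl_diag_gaussian(mu, log_var))

# An encoder that outputs the prior incurs zero rate:
mu = np.zeros((4, 2)); log_var = np.zeros((4, 2))
print(kl_diag_gaussian(mu, log_var))  # -> [0. 0. 0. 0.]
```

The β weight trades off the two Markov-chain roles of T: predicting Y while compressing X.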

## 9 Citations

On the Information Bottleneck Problems: Models, Connections, Applications and Information Theoretic Views

- Computer Science, Mathematics · Entropy
- 2020

This tutorial paper focuses on the variants of the bottleneck problem taking an information theoretic perspective and discusses practical methods to solve it, as well as its connection to coding and…

Information Bottleneck Analysis by a Conditional Mutual Information Bound

- Computer Science, Medicine · Entropy
- 2021

It is demonstrated that the conditional mutual information I(z;x|y) provides an alternative upper bound for I(z;n), and this bound is applicable even if z is not a sufficient representation of x, that is, I(z;y)≠I(x;y).
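The identity behind such bounds is the chain rule for mutual information, I(z; x, y) = I(z; y) + I(z; x|y). A quick numerical check on a random discrete joint distribution (the table and variable names are illustrative, not the paper's code):

```python
import numpy as np

def mi(pxy):
    # Mutual information I(X;Y) in nats from a joint table pxy[x, y].
    px = pxy.sum(1, keepdims=True); py = pxy.sum(0, keepdims=True)
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask]))

def cond_mi(pzxy):
    # I(Z;X|Y) = sum_y p(y) * I(Z;X | Y=y) from a joint table pzxy[z, x, y].
    total = 0.0
    for y in range(pzxy.shape[2]):
        py = pzxy[:, :, y].sum()
        if py > 0:
            total += py * mi(pzxy[:, :, y] / py)
    return total

rng = np.random.default_rng(0)
p = rng.random((2, 3, 2)); p /= p.sum()   # joint p(z, x, y)

# Chain rule: I(Z; X, Y) = I(Z; Y) + I(Z; X | Y)
lhs = mi(p.reshape(2, 6))          # treat (x, y) as one variable
rhs = mi(p.sum(1)) + cond_mi(p)    # p.sum(1) is the joint of (z, y)
assert np.isclose(lhs, rhs)
```

Since I(z;x|y) ≥ 0, conditioning never decreases the left-hand side, which is what makes conditional mutual information usable as an upper-bound ingredient.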

A Comparison of Variational Bounds for the Information Bottleneck Functional

- Medicine, Computer Science · Entropy
- 2020

This work sheds light on the variational bounds proposed in Alemi et al. (2017) and Fischer (2020) for the information bottleneck (IB) and the conditional entropy bottleneck (CEB) functional by showing that, in the most general setting, no ordering can be established between these variational bounds.

Learning Conditional Invariance through Cycle Consistency

- Computer Science, Mathematics · GCPR
- 2021

This work proposes a novel approach to cycle consistency based on the deep information bottleneck and, in contrast to other approaches, allows using continuous target properties and provides inherent model selection capabilities.

On Learning Prediction-Focused Mixtures

- Computer Science, Mathematics · ArXiv
- 2021

This work introduces prediction-focused modeling for mixtures, which automatically selects the dimensions relevant to the prediction task; it identifies relevant signal from the input, outperforms models that are not prediction-focused, and is easy to optimize.

Inverse Learning of Symmetry Transformations

- Computer Science, Mathematics · ArXiv
- 2020

This work proposes learning two latent subspaces, where the first subspace captures the property and the second subspace the remaining invariant information, based on the deep information bottleneck principle in combination with a mutual information regulariser.

Information Bottleneck for Estimating Treatment Effects with Systematically Missing Covariates

- Computer Science, Medicine · Entropy
- 2020

This paper trains an information bottleneck to perform a low-dimensional compression of covariates by explicitly considering the relevance of information for treatment effects and can reliably and accurately estimate treatment effects even in the absence of a full set of covariate information at test time.

Prediction-focused Mixture Models

- 2021

This work introduces the prediction-focused mixture model, which selects and models input features relevant to predicting the targets and demonstrates that this approach identifies relevant signal from inputs even when the model is highly misspecified.

Inverse Learning of Symmetries

- Computer Science · NeurIPS
- 2020

This work proposes to learn the symmetry transformation with a model consisting of two latent subspaces, where the first subspace captures the target and the second subspace the remaining invariant information, based on the deep information bottleneck in combination with a continuous mutual information regulariser.

## References

Showing 1–10 of 30 references

Deep learning and the information bottleneck principle

- Computer Science, Mathematics · 2015 IEEE Information Theory Workshop (ITW)
- 2015

It is argued that the optimal architecture (the number of layers and the features/connections at each layer) is related to the bifurcation points of the information bottleneck tradeoff, namely, relevant compression of the input layer with respect to the output layer.

On the Information Bottleneck Theory of Deep Learning

- Computer Science · ICLR
- 2018

This paper presents a comprehensive theory of large-scale learning with Deep Neural Networks (DNNs), when optimized with Stochastic Gradient Descent (SGD), built on three theoretical components.

Information Dropout: Learning Optimal Representations Through Noisy Computation

- Computer Science, Medicine · IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2018

It is proved that Information Dropout achieves a comparable or better generalization performance than binary dropout, especially on smaller models, since it can automatically adapt the noise to the structure of the network, as well as to the test sample.

An Information-Theoretic Analysis of Deep Latent-Variable Models

- Computer Science, Mathematics · ArXiv
- 2017

An information-theoretic framework for understanding trade-offs in unsupervised learning of deep latent-variable models using variational inference is presented, showing how this framework sheds light on many recently proposed extensions to the variational autoencoder family.

Opening the Black Box of Deep Neural Networks via Information

- Computer Science · ArXiv
- 2017

This work demonstrates the effectiveness of the Information-Plane visualization of DNNs and shows that the training time is dramatically reduced when adding more hidden layers, and the main advantage of the hidden layers is computational.

Emergence of Invariance and Disentanglement in Deep Representations

- Computer Science, Mathematics · 2018 Information Theory and Applications Workshop (ITA)
- 2018

It is shown that in a deep neural network invariance to nuisance factors is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations.

InfoVAE: Balancing Learning and Inference in Variational Autoencoders

- Computer Science · AAAI
- 2019

It is shown that the proposed InfoVAE model can significantly improve the quality of the variational posterior and can make effective use of the latent features regardless of the flexibility of the decoding distribution.

Learning Sparse Latent Representations with the Deep Copula Information Bottleneck

- Computer Science, Mathematics · ICLR
- 2018

This paper adopts the deep information bottleneck model, identifies its shortcomings and proposes a model that circumvents them, and applies a copula transformation which restores the invariance properties of the information bottleneck method and leads to disentanglement of the features in the latent space.

Gaussian Lower Bound for the Information Bottleneck Limit

- Mathematics, Computer Science · J. Mach. Learn. Res.
- 2017

A Gaussian lower bound to the IB curve is introduced, and it is shown that the optimal Gaussian embedding is bounded from above by non-linear CCA, which provides a fundamental limit on the ability to Gaussianize arbitrary data sets and solve complex problems by linear methods.

Fixing a Broken ELBO

- Computer Science, Mathematics · ICML
- 2018

This framework derives variational lower and upper bounds on the mutual information between the input and the latent variable, and uses these bounds to derive a rate-distortion curve that characterizes the tradeoff between compression and reconstruction accuracy.