# On the Out-of-distribution Generalization of Probabilistic Image Modelling

@inproceedings{Zhang2021OnTO, title={On the Out-of-distribution Generalization of Probabilistic Image Modelling}, author={Mingtian Zhang and Andi Zhang and Steven G. McDonagh}, booktitle={Neural Information Processing Systems}, year={2021} }

Out-of-distribution (OOD) detection and lossless compression constitute two problems that can be solved by the training of probabilistic models on a first dataset with subsequent likelihood evaluation on a second dataset, where data distributions differ. By defining the generalization of probabilistic models in terms of likelihood we show that, in the case of image models, the OOD generalization ability is dominated by local features. This motivates our proposal of a Local Autoregressive model…

## 16 Citations

### Out-of-Distribution Detection with Class Ratio Estimation

- Computer ScienceArXiv
- 2022

This work proposes to unify density ratio based methods under a novel framework that builds energy-based models and employs differing base distributions and proposes to directly estimate the density ratio of a data sample through class ratio estimation.

### Generalization Gap in Amortized Inference

- Computer ScienceArXiv
- 2022

This work proposes a new training objective, inspired by the classic wake-sleep algorithm, to improve the generalizations properties of amortized inference and demonstrates how it can improve generalization performance in the context of image modeling and lossless compression.

### On the Usefulness of Deep Ensemble Diversity for Out-of-Distribution Detection

- Computer ScienceArXiv
- 2022

It is shown that practically, even better OOD detection performance can be achieved for Deep Ensembles by averaging task-speciﬁc detection scores such as Energy over the ensemble.

### Parallel Neural Local Lossless Compression

- Computer ScienceArXiv
- 2022

This paper proposes two parallelization schemes for local autoregressive models and provides experimental evidence of gains in compression runtime compared to the previous, non-parallel implementation.

### Falsehoods that ML researchers believe about OOD detection

- Computer ScienceArXiv
- 2022

A framework, the OOD proxy framework, is proposed, to unify these methods, and it is argued that likelihood ratio is a principled method for OOD detection and not a mere ‘ﬁx’.

### Lossy Image Compression with Quantized Hierarchical VAEs

- Computer ScienceArXiv
- 2022

This work redesigns ResNet VAEs using a quantization-aware posterior and prior, enabling easy quantization and entropy coding for image compression and presents a powerful andcient class of lossy image coders, outperforming previous methods on natural image (lossy) compression.

### Improving VAE-based Representation Learning

- Computer ScienceArXiv
- 2022

It is shown that by using a decoder that prefers to learn local features, the remaining global features can be well captured by the latent, which signiﬁcantly improves performance of a downstream classi-cation task.

### Lossless Compression with Probabilistic Circuits

- Computer ScienceICLR
- 2022

A new class of tractable lossless compression models that permit efficient encoding and decoding: Probabilistic Circuits (PCs), which are a class of neural networks involving |p| computational units that support efficient marginalization over arbitrary subsets of the D feature dimensions, enabling efficient arithmetic coding.

### Generalizing to Unseen Domains: A Survey on Domain Generalization

- Computer ScienceIJCAI
- 2021

This paper provides a formal definition of domain generalization and discusses several related fields, and categorizes recent algorithms into three classes and present them in detail: data manipulation, representation learning, and learning strategy, each of which contains several popular algorithms.

### PILC: Practical Image Lossless Compression with an End-to-end GPU Oriented Neural Framework

- Computer Science2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2022

PILC is proposed, an end-to-end image lossless compression framework that achieves 200 MB/s for both compression and decom-pression with a single NVIDIA Tesla V100 GPU, 10× faster than the most efficient one before.

## References

SHOWING 1-10 OF 53 REFERENCES

### Multiscale Score Matching for Out-of-Distribution Detection

- Computer ScienceICLR
- 2021

A new methodology for detecting out-of-distribution (OOD) images by utilizing norms of the score estimates at multiple noise scales, which can effectively separate CIFAR-10 and SVHN images, a setting which has been previously shown to be difficult for deep likelihood models.

### Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks

- Computer ScienceICLR
- 2018

The proposed ODIN method, based on the observation that using temperature scaling and adding small perturbations to the input can separate the softmax score distributions between in- and out-of-distribution images, allowing for more effective detection, consistently outperforms the baseline approach by a large margin.

### Understanding Anomaly Detection with Deep Invertible Networks through Hierarchies of Distributions and Features

- Computer ScienceNeurIPS
- 2020

Two methods are proposed, first, using the log likelihood ratios of two identical models, one trained on the in-distribution data and the other on a more general distribution of images, which achieve strong anomaly detection performance in the unsupervised setting, reaching comparable performance as state-of-the-art classifier-based methods in the supervised setting.

### Detecting Out-of-Distribution Inputs to Deep Generative Models Using a Test for Typicality

- Computer ScienceArXiv
- 2019

This work proposes a statistically principled, easy-to-implement test using the empirical distribution of model likelihoods to determine whether or not inputs reside in the typical set.

### Why Normalizing Flows Fail to Detect Out-of-Distribution Data

- Computer ScienceNeurIPS
- 2020

This work demonstrates that flows learn local pixel correlations and generic image-to-latent-space transformations which are not specific to the target image dataset, and shows that by modifying the architecture of flow coupling layers the authors can bias the flow towards learning the semantic structure of the target data, improving OOD detection.

### Likelihood Ratios for Out-of-Distribution Detection

- Computer ScienceNeurIPS
- 2019

This work investigates deep generative model based approaches for OOD detection and observes that the likelihood score is heavily affected by population level background statistics, and proposes a likelihood ratio method forDeep generative models which effectively corrects for these confounding background statistics.

### Detecting Out-of-Distribution Examples with In-distribution Examples and Gram Matrices

- Computer ScienceArXiv
- 2019

It is found that characterizing activity patterns by Gram matrices and identifying anomalies in gram matrix values can yield high OOD detection rates, and this method generally performs better than or equal to state-of-the-art Ood detection methods.

### Pixel Recurrent Neural Networks

- Computer ScienceICML
- 2016

A deep neural network is presented that sequentially predicts the pixels in an image along the two spatial dimensions and encodes the complete set of dependencies in the image to achieve log-likelihood scores on natural images that are considerably better than the previous state of the art.

### Understanding the (un)interpretability of natural image distributions using generative models

- Computer ScienceArXiv
- 2019

Methods to extract explicit probability density estimates from GANs, and the properties of these image density functions are described, to show that density functions of natural images are difficult to interpret and thus limited in use.

### Input complexity and out-of-distribution detection with likelihood-based generative models

- Computer ScienceICLR
- 2020

This paper uses an estimate of input complexity to derive an efficient and parameter-free OOD score, which can be seen as a likelihood-ratio, akin to Bayesian model comparison, and finds such score to perform comparably to, or even better than, existing OOD detection approaches under a wide range of data sets, models, model sizes, and complexity estimates.