Corpus ID: 237431305

On the Out-of-distribution Generalization of Probabilistic Image Modelling

Mingtian Zhang, Andi Zhang, Steven G. McDonagh. Published at Neural Information Processing Systems.
Out-of-distribution (OOD) detection and lossless compression are two problems that can both be addressed by training a probabilistic model on one dataset and then evaluating its likelihood on a second dataset whose distribution differs. By defining the generalization of probabilistic models in terms of likelihood, we show that, in the case of image models, OOD generalization ability is dominated by local features. This motivates our proposal of a Local Autoregressive model… 
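The likelihood-based setup in the abstract can be sketched in a few lines. This is an illustrative sketch, not the authors' code: `log_p_full` and `log_p_local` stand for per-image log-likelihoods (in bits) from a full generative model and a purely local model, and subtracting the two is one common way to discount the local-feature contribution the paper identifies.

```python
import numpy as np

def likelihood_ratio_score(log_p_full, log_p_local):
    """OOD score as a log-likelihood ratio: subtracting the local
    model's likelihood discounts generic local image statistics,
    leaving the non-local structure that distinguishes datasets."""
    return np.asarray(log_p_full) - np.asarray(log_p_local)

# Toy, hypothetical numbers: an in-distribution image gains more
# from the full model than from local features alone.
score_id = likelihood_ratio_score(-2.0, -3.5)   # positive: looks in-distribution
score_ood = likelihood_ratio_score(-2.8, -2.7)  # negative: looks OOD
```

A threshold on this score then separates in-distribution from OOD inputs; the threshold itself is tuned on held-out data.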


Out-of-Distribution Detection with Class Ratio Estimation

This work proposes a novel framework that unifies density-ratio based methods as energy-based models with differing base distributions, and directly estimates the density ratio of a data sample through class ratio estimation.

Generalization Gap in Amortized Inference

This work proposes a new training objective, inspired by the classic wake-sleep algorithm, to improve the generalization properties of amortized inference, and demonstrates how it can improve generalization performance in the context of image modeling and lossless compression.

Augmenting Softmax Information for Selective Classification with Out-of-Distribution Data

This work examines selective classification in the presence of OOD data (SCOD), and proposes a novel method for SCOD, Softmax Information Retaining Combination (SIRC), that augments softmax-based confidence scores with feature-agnostic information such that their ability to identify OOD samples is improved without sacrificing separation between correct and incorrect ID predictions.

On the Usefulness of Deep Ensemble Diversity for Out-of-Distribution Detection

It is shown that, in practice, even better OOD detection performance can be achieved with deep ensembles by averaging task-specific detection scores, such as Energy, over the ensemble.

Parallel Neural Local Lossless Compression

This paper proposes two parallelization schemes for local autoregressive models and provides experimental evidence of gains in compression runtime compared to the previous, non-parallel implementation.

Falsehoods that ML researchers believe about OOD detection

A framework, the OOD proxy framework, is proposed to unify these methods, and it is argued that the likelihood ratio is a principled method for OOD detection and not a mere ‘fix’.

Lossy Image Compression with Quantized Hierarchical VAEs

This work redesigns ResNet VAEs using a quantization-aware posterior and prior, enabling easy quantization and entropy coding for image compression, and presents a powerful and efficient class of lossy image coders that outperforms previous methods on natural image (lossy) compression.

NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-wise Modeling

NIRVANA is proposed, which treats videos as groups of frames and fits separate networks to each group to perform patch-wise prediction, achieving variable-bitrate compression by adapting to videos with varying inter-frame motion.

Improving VAE-based Representation Learning

It is shown that by using a decoder that prefers to learn local features, the remaining global features can be well captured by the latent, which significantly improves performance on a downstream classification task.

Lossless Compression with Probabilistic Circuits

A new class of tractable lossless compression models that permit efficient encoding and decoding is introduced: Probabilistic Circuits (PCs), a class of neural networks with |p| computational units that support efficient marginalization over arbitrary subsets of the D feature dimensions, enabling efficient arithmetic coding.



Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks

The proposed ODIN method is based on the observation that temperature scaling and small input perturbations separate the softmax score distributions of in- and out-of-distribution images, allowing for more effective detection; it consistently outperforms the baseline approach by a large margin.
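The temperature-scaling half of ODIN is easy to sketch; the input-perturbation step, which requires gradients through the network, is omitted here, and `T = 1000.0` is only the commonly reported setting, not a prescribed value.

```python
import numpy as np

def odin_confidence(logits, T=1000.0):
    """Max softmax probability after temperature scaling.
    Dividing logits by a large T flattens the distribution, which
    empirically widens the gap between the confidence scores of
    in-distribution and out-of-distribution inputs."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()                      # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return float(p.max())
```

Inputs whose `odin_confidence` falls below a validation-tuned threshold are flagged as out-of-distribution.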

Understanding Anomaly Detection with Deep Invertible Networks through Hierarchies of Distributions and Features

Two methods are proposed, the first using the log-likelihood ratio of two identical models, one trained on the in-distribution data and the other on a more general distribution of images; they achieve strong anomaly detection performance in the unsupervised setting, reaching performance comparable to state-of-the-art classifier-based methods in the supervised setting.

Why Normalizing Flows Fail to Detect Out-of-Distribution Data

This work demonstrates that flows learn local pixel correlations and generic image-to-latent-space transformations which are not specific to the target image dataset, and shows that by modifying the architecture of flow coupling layers the authors can bias the flow towards learning the semantic structure of the target data, improving OOD detection.

Likelihood Ratios for Out-of-Distribution Detection

This work investigates deep generative model based approaches for OOD detection, observes that the likelihood score is heavily affected by population-level background statistics, and proposes a likelihood ratio method for deep generative models that effectively corrects for these confounding background statistics.

Detecting Out-of-Distribution Examples with In-distribution Examples and Gram Matrices

It is found that characterizing activity patterns by Gram matrices and identifying anomalies in Gram-matrix values can yield high OOD detection rates, and this method generally performs better than or on par with state-of-the-art OOD detection methods.
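A minimal sketch of the Gram-matrix idea: the shapes and the min/max deviation statistic follow the general recipe described above, but the specific details here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gram(feature_maps):
    """Gram matrix of one layer's activations: flatten each of the
    C channels spatially, then take channel-pairwise inner products,
    mapping an array of shape (C, H, W) to one of shape (C, C)."""
    C = feature_maps.shape[0]
    A = feature_maps.reshape(C, -1)
    return A @ A.T

def deviation(g, lo, hi):
    """Total deviation of a test Gram matrix from the elementwise
    [lo, hi] range recorded on training data; a large value flags a
    potentially out-of-distribution input."""
    return float(np.clip(lo - g, 0, None).sum() +
                 np.clip(g - hi, 0, None).sum())
```

In the paper's setting these per-layer deviations are normalized and summed across layers to form the final detection score.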

Pixel Recurrent Neural Networks

A deep neural network is presented that sequentially predicts the pixels in an image along the two spatial dimensions and encodes the complete set of dependencies in the image to achieve log-likelihood scores on natural images that are considerably better than the previous state of the art.

Understanding the (un)interpretability of natural image distributions using generative models

Methods to extract explicit probability density estimates from GANs are described, along with the properties of these image density functions, showing that density functions of natural images are difficult to interpret and thus of limited use.

Input complexity and out-of-distribution detection with likelihood-based generative models

This paper uses an estimate of input complexity to derive an efficient and parameter-free OOD score, which can be seen as a likelihood-ratio, akin to Bayesian model comparison, and finds such score to perform comparably to, or even better than, existing OOD detection approaches under a wide range of data sets, models, model sizes, and complexity estimates.
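The parameter-free score in that paper combines a model's negative log-likelihood with a generic compressed length. A hedged sketch, using zlib as the complexity estimator (the paper uses image codecs such as PNG or FLIF, so the numbers here are only illustrative):

```python
import zlib

def complexity_bits(raw_bytes):
    """L(x): input complexity estimated as the bit length of a
    generic lossless compression of the input (zlib stand-in)."""
    return 8 * len(zlib.compress(raw_bytes, 9))

def ood_score(nll_bits, raw_bytes):
    """S(x) = -log2 p(x) - L(x): the model's negative log-likelihood
    corrected by input complexity, interpretable as a likelihood
    ratio against a universal compressor; larger values indicate a
    more out-of-distribution input."""
    return nll_bits - complexity_bits(raw_bytes)
```

The correction matters because simple images (low L(x)) tend to receive spuriously high likelihoods from deep generative models, which the raw likelihood score would misread as in-distribution.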

Learning Multiple Layers of Features from Tiny Images

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.

PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications

This work discusses the implementation of PixelCNNs, a recently proposed class of powerful generative models with tractable likelihood, and presents a number of modifications to the original model that both simplify its structure and improve its performance.