Identifiability of deep generative models under mixture priors without auxiliary information

Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam
We prove identifiability of a broad class of deep latent variable models that (a) have universal approximation capabilities and (b) are the decoders of variational autoencoders that are commonly used in practice. Unlike existing work, our analysis does not require weak supervision, auxiliary information, or conditioning in the latent space. Recently, there has been a surge of works studying identifiability of such models. In these works, the main assumption is that along with the data, an… 

Generalized Identifiability Bounds for Mixture Models with Grouped Samples

It is shown that, if every subset of k mixture components of a mixture model is linearly independent, then that mixture model is identifiable with only (2m − 1)/(k − 1) samples per group, and that this bound cannot be improved.
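As a quick illustration of the bound quoted above, the sufficient number of samples per group can be computed directly from the number of mixture components m and the linear-independence order k. This is a minimal sketch: the function name is illustrative, and rounding up for non-integer values is my assumption, not something stated in the abstract.

```python
import math

def samples_per_group(m: int, k: int) -> int:
    """Samples per group sufficient for identifiability of an
    m-component mixture in which every k components are linearly
    independent, per the (2m - 1)/(k - 1) bound quoted above.
    Rounding up is an assumption for non-integer values."""
    if m < 1 or k < 2:
        raise ValueError("need m >= 1 components and k >= 2")
    return math.ceil((2 * m - 1) / (k - 1))

# e.g. a 3-component mixture with pairwise (k = 2) independence
print(samples_per_group(3, 2))  # -> 5
```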



Variational Autoencoders and Nonlinear ICA: A Unifying Framework

This work shows that for a broad family of deep latent-variable models, identification of the true joint distribution over observed and latent variables is actually possible up to very simple transformations, thus achieving a principled and powerful form of disentanglement.

I Don't Need u: Identifiable Non-Linear ICA Without Side Information

Surprisingly, it is found that side information is not necessary for algorithmic stability: using standard quantitative measures of identifiability, deep generative models with latent clusterings are empirically identifiable to the same degree as models that rely on auxiliary labels.

An Identifiable Double VAE For Disentangled Representations

This work proposes a novel VAE-based generative model with theoretical guarantees on identifiability, obtaining its conditional prior over the latents by learning an optimal representation, which further strengthens the regularization.

VAE with a VampPrior

This paper proposes to extend the variational auto-encoder (VAE) framework with a new type of prior called "Variational Mixture of Posteriors" prior, or VampPrior for short, which consists of a mixture distribution with components given by variational posteriors conditioned on learnable pseudo-inputs.
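The VampPrior density described above is a uniform mixture of the variational posteriors evaluated at K learnable pseudo-inputs: p(z) = (1/K) Σ_k q(z | u_k). A minimal NumPy sketch of evaluating that log-density, assuming diagonal-Gaussian posteriors and taking the encoder's outputs at the pseudo-inputs as given (function names are illustrative, not the paper's code):

```python
import numpy as np

def log_normal_diag(z, mu, log_var):
    # log N(z; mu, diag(exp(log_var))), summed over latent dimensions
    return -0.5 * np.sum(
        np.log(2 * np.pi) + log_var + (z - mu) ** 2 / np.exp(log_var),
        axis=-1)

def log_vamp_prior(z, pseudo_mu, pseudo_log_var):
    """log p(z) for a VampPrior-style mixture: a uniform mixture of
    Gaussian posteriors at K pseudo-inputs. pseudo_mu/pseudo_log_var
    have shape (K, D) and stand in for the encoder's outputs on the
    learnable pseudo-inputs; z has shape (D,)."""
    # component log-densities, shape (K,)
    comp = log_normal_diag(z[None, :], pseudo_mu, pseudo_log_var)
    # numerically stable log-mean-exp over the K components
    m = comp.max()
    return m + np.log(np.mean(np.exp(comp - m)))
```

With all K components identical, the mixture collapses to a single Gaussian, which gives a simple sanity check on the log-mean-exp.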

Lagging Inference Networks and Posterior Collapse in Variational Autoencoders

This paper investigates posterior collapse from the perspective of training dynamics and proposes an extremely simple modification to VAE training to reduce inference lag: guided by the model's current mutual information between latent variable and observation, the inference network is optimized before each model update.
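The alternating schedule described above can be sketched as a generic training step: repeatedly update the inference network until the objective stops improving, then take a single decoder update. Everything here (function names, the convergence check) is a hypothetical illustration of the schedule, not the authors' implementation:

```python
def aggressive_vae_step(enc, dec, batch, elbo, enc_update, dec_update,
                        max_inner=10):
    """One 'aggressive' step: optimize the inference network (enc)
    to convergence before a single generator (dec) update. elbo,
    enc_update and dec_update are caller-supplied functions; all
    names are illustrative."""
    best = elbo(enc, dec, batch)
    for _ in range(max_inner):
        new_enc = enc_update(enc, dec, batch)  # inference-only update
        cur = elbo(new_enc, dec, batch)
        if cur <= best + 1e-8:                 # inner loop converged
            break
        enc, best = new_enc, cur
    dec = dec_update(enc, dec, batch)          # one model update
    return enc, dec
```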

Identifiable Variational Autoencoders via Sparse Decoding

The Sparse VAE is identifiable: given data drawn from the model, there exists a uniquely optimal set of factors. Empirically, it recovers meaningful latent factors and has smaller held-out reconstruction error than related methods.

Visual Representation Learning Does Not Generalize Strongly Within the Same Domain

This paper tests whether 17 unsupervised, weakly supervised, and fully supervised representation learning approaches correctly infer the generative factors of variation in simple datasets, and observes that all of them struggle to learn the underlying mechanism regardless of supervision signal and architectural bias.

Variational Deep Embedding: An Unsupervised and Generative Approach to Clustering

Variational Deep Embedding (VaDE) is proposed, a novel unsupervised generative clustering approach within the framework of Variational Auto-Encoder (VAE), which shows its capability of generating highly realistic samples for any specified cluster, without using supervised information during training.
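Models of this kind place a Gaussian-mixture prior on the latent space and assign a latent code z to a cluster via the posterior over components, q(c | z) ∝ π_c · N(z; μ_c, Σ_c). A minimal NumPy sketch of that assignment step, assuming diagonal covariances (names and shapes are illustrative, not VaDE's code):

```python
import numpy as np

def cluster_responsibilities(z, pi, mu, log_var):
    """Posterior over mixture components for a latent code z:
    q(c | z) proportional to pi_c * N(z; mu_c, diag(exp(log_var_c))).
    Shapes: z (D,), pi (K,), mu (K, D), log_var (K, D)."""
    # per-component Gaussian log-densities, shape (K,)
    log_comp = -0.5 * np.sum(
        np.log(2 * np.pi) + log_var + (z - mu) ** 2 / np.exp(log_var),
        axis=1)
    log_w = np.log(pi) + log_comp
    log_w -= log_w.max()        # subtract max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()          # responsibilities sum to 1
```

Two equally weighted components placed symmetrically about z should receive equal responsibility, which is an easy check on the normalization.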

Learning Latent Subspaces in Variational Autoencoders

A VAE-based generative model is proposed that extracts features correlated with binary labels in the data and structures them in an easily interpreted latent subspace; the utility of the learned representations is demonstrated on attribute-manipulation tasks on both the Toronto Face and CelebA datasets.

ICE-BeeM: Identifiable Conditional Energy-Based Deep Models

This paper establishes sufficient conditions under which a large family of conditional energy-based models is identifiable in function space, up to a simple transformation, and proposes the framework of independently modulated component analysis (IMCA), a new form of nonlinear ICA in which the independence assumption is relaxed.