• Corpus ID: 245537948

Disentanglement and Generalization Under Correlation Shifts

@article{Funke2021DisentanglementAG,
  title={Disentanglement and Generalization Under Correlation Shifts},
  author={Christina M. Funke and Paul Vicol and Kuan-Chieh Wang and Matthias K{\"u}mmerer and Richard S. Zemel and Matthias Bethge},
  journal={ArXiv},
  year={2021},
  volume={abs/2112.14754}
}
Correlations between factors of variation are prevalent in real-world data. Machine learning algorithms may benefit from exploiting such correlations, as they can increase predictive performance on noisy data. However, such correlations are often not robust (e.g., they may change between domains, datasets, or applications), and we wish to avoid exploiting them. Disentanglement methods aim to learn representations which capture different factors of variation in latent subspaces. A common approach… 
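To make the notion of a correlation shift concrete, here is a minimal Python sketch (not taken from the paper; all names and numbers are illustrative): two binary factors of variation agree with high probability at training time, but that correlation disappears at test time, so a shortcut predictor that leans on the training-time correlation degrades under the shift.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_factors(n, rho):
    """Sample two binary factors (y1, y2) that agree with probability rho.

    rho close to 1.0 => strongly correlated factors; rho = 0.5 => independent.
    """
    y1 = rng.integers(0, 2, size=n)
    agree = rng.random(n) < rho
    y2 = np.where(agree, y1, 1 - y1)
    return y1, y2

# Training distribution: the two factors are almost perfectly correlated.
y1_train, y2_train = sample_factors(10_000, rho=0.95)

# Test distribution: the correlation has shifted away (factors independent).
y1_test, y2_test = sample_factors(10_000, rho=0.50)

# A "shortcut" predictor that estimates factor 1 by copying factor 2
# exploits the training correlation but fails once the correlation shifts.
print("shortcut accuracy (train):", (y2_train == y1_train).mean())
print("shortcut accuracy (test): ", (y2_test == y1_test).mean())
```

In this toy setup the shortcut scores about 0.95 on the training distribution and about 0.5 on the shifted test distribution, which is the failure mode that a representation disentangling the two factors is meant to avoid.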

References

Showing 1-10 of 87 references
Is Independence all you need? On the Generalization of Representations Learned from Correlated Data
TLDR: This work bridges the gap to real-world scenarios by analyzing the behavior of the most prominent methods and disentanglement scores on correlated data in a large-scale empirical study (including 3900 models).
Multi-Level Variational Autoencoder: Learning Disentangled Representations from Grouped Observations
TLDR: The Multi-Level Variational Autoencoder (ML-VAE), a new deep probabilistic model for learning a disentangled representation of a set of grouped observations, separates the latent representation into semantically meaningful parts by working both at the group level and the observation level, while retaining efficient test-time inference.
Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
TLDR: This paper theoretically shows that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data, and trains more than 12,000 models covering the most prominent methods and evaluation metrics on seven different data sets.
Adversarial Disentanglement with Grouped Observations
TLDR: The training objective is augmented to minimize an appropriately defined mutual information term in an adversarial way; the resulting method efficiently separates content- and style-related attributes and generalizes to unseen data.
Weakly Supervised Disentanglement with Guarantees
TLDR: A theoretical framework is provided to assist in analyzing the disentanglement guarantees (or lack thereof) conferred by weak supervision when coupled with learning algorithms based on distribution matching, and the guarantees and limitations of several weak-supervision methods are empirically verified.
A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation
TLDR: This work theoretically shows that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data, investigates concrete benefits of enforcing disentanglement of the learned representations, and considers a reproducible experimental setup covering several data sets.
Wasserstein Dependency Measure for Representation Learning
TLDR: It is empirically demonstrated that mutual information-based representation learning approaches fail to learn complete representations on a number of designed and real-world tasks, and that a practical approximation to the proposed Wasserstein dependency measure, constructed using Lipschitz-constraint techniques from the GAN literature, achieves substantially improved results on tasks where incomplete representations are a major challenge.
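For orientation only (a sketch based on the standard definitions rather than a quotation from that paper): mutual information can be written as the KL divergence between the joint distribution and the product of its marginals, and the Wasserstein dependency measure replaces that KL divergence with a Wasserstein distance, which is what the Lipschitz-constrained (GAN-style) critics approximate in practice.

```latex
% Mutual information as a divergence between the joint and the product of marginals:
I(X;Z) \;=\; D_{\mathrm{KL}}\!\big(\, p(x,z) \;\big\|\; p(x)\,p(z) \,\big)

% Wasserstein dependency measure: the same comparison under a Wasserstein distance:
I_{\mathcal{W}}(X;Z) \;=\; \mathcal{W}\!\big(\, p(x,z),\; p(x)\,p(z) \,\big)
```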
DIVA: Domain Invariant Variational Autoencoders
TLDR: The Domain Invariant Variational Autoencoder (DIVA) is proposed: a generative model that tackles the problem of domain generalization by learning three independent latent subspaces, one for the domain, one for the class, and one for any residual variations.
Isolating Sources of Disentanglement in Variational Autoencoders
We decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables. We use this to motivate our $\beta$-TCVAE (Total Correlation Variational Autoencoder)…
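For reference, the decomposition this excerpt alludes to is the standard one used to define $\beta$-TCVAE (reproduced here from memory as a sketch, with $n$ indexing the data points and $z_j$ the latent dimensions): the aggregate KL term of the ELBO splits into an index-code mutual information term, a total correlation term, and a dimension-wise KL term, and $\beta$-TCVAE up-weights the total correlation term.

```latex
% Decomposition of the aggregate KL term in the ELBO into
% (i) index-code mutual information, (ii) total correlation, (iii) dimension-wise KL:
\mathbb{E}_{p(n)}\Big[ D_{\mathrm{KL}}\big( q(z \mid n) \,\|\, p(z) \big) \Big]
  \;=\; \underbrace{I_q(z; n)}_{\text{index-code MI}}
  \;+\; \underbrace{D_{\mathrm{KL}}\big( q(z) \,\|\, \textstyle\prod_j q(z_j) \big)}_{\text{total correlation}}
  \;+\; \underbrace{\textstyle\sum_j D_{\mathrm{KL}}\big( q(z_j) \,\|\, p(z_j) \big)}_{\text{dimension-wise KL}}
```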
Adversarial Discriminative Domain Adaptation
TLDR: It is shown that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and the promise of the approach is demonstrated by exceeding state-of-the-art unsupervised adaptation results on standard domain adaptation tasks as well as a difficult cross-modality object classification task.