Equivariant Disentangled Transformation for Domain Generalization under Combination Shift

Yivan Zhang, Jindong Wang, Xingxu Xie, Masashi Sugiyama
Machine learning systems may encounter unexpected problems when the data distribution changes in the deployment environment. A major reason is that certain combinations of domains and labels are not observed during training but appear in the test environment. Although various invariance-based algorithms can be applied, we find that the performance gain is often marginal. To formally analyze this issue, we provide a unique algebraic formulation of the combination shift problem based on the… 
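As a toy illustration of the setting described above (names are illustrative, not the paper's actual datasets): every domain and every label is seen during training, yet some specific (domain, label) combinations appear only at test time.

```python
import itertools

# Hypothetical combination-shift split: hold out the "diagonal"
# (domain, label) pairs so they never appear in training but do
# appear in the test environment.
domains = ["photo", "sketch", "cartoon"]
labels = ["dog", "cat", "bird"]

all_pairs = list(itertools.product(domains, labels))
test_only = {(d, y) for d, y in all_pairs
             if domains.index(d) == labels.index(y)}
train_pairs = [p for p in all_pairs if p not in test_only]

assert len(all_pairs) == 9 and len(train_pairs) == 6
# Every domain and every label is still observed in training --
# only certain combinations are missing.
assert {d for d, _ in train_pairs} == set(domains)
assert {y for _, y in train_pairs} == set(labels)
```

A model that entangles domain and label information can fit `train_pairs` perfectly while failing on the held-out combinations, which is why marginal invariance alone gives little gain here.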

Visual Representation Learning Does Not Generalize Strongly Within the Same Domain

This paper tests whether 17 unsupervised, weakly supervised, and fully supervised representation learning approaches correctly infer the generative factors of variation in simple datasets, and observes that all of them struggle to learn the underlying mechanism regardless of supervision signal and architectural bias.

Generalizing to unseen domains via distribution matching

This work focuses on domain generalization: a formalization in which the data-generating process at test time may yield samples from never-before-seen domains (distributions). It relies on a simple lemma to derive a generalization bound for this setting.

Environment Inference for Invariant Learning

EIIL is proposed, a general framework for domain-invariant learning that incorporates Environment Inference to directly infer partitions that are maximally informative for downstream Invariant Learning and establishes connections between EIIL and algorithmic fairness.

Generalizing to Unseen Domains via Adversarial Data Augmentation

This work proposes an iterative procedure that augments the dataset with examples from a fictitious target domain that is "hard" under the current model, yielding an adaptive data augmentation method in which adversarial examples are appended at each iteration.
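The iterative scheme just described can be sketched for a linear model: perturb the inputs by gradient ascent on the loss to create "hard" fictitious-domain examples, then append them to the training set. This is a minimal illustration, not the paper's implementation; the model, step sizes, and data are all hypothetical.

```python
import numpy as np

def adversarial_augment(X, y, w, step=0.05, n_steps=5):
    """One round of the iterative scheme: perturb inputs by gradient
    ascent on the squared loss of the current linear model w, creating
    'hard' examples from a fictitious target domain."""
    X_adv = X.copy()
    for _ in range(n_steps):
        residual = X_adv @ w - y                 # shape (n,)
        grad_X = residual[:, None] * w[None, :]  # d(loss_i)/dX_i
        X_adv += step * grad_X                   # ascend the loss
    return X_adv

def loss(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
w = np.array([1.0, -2.0, 0.5])
y = X @ w + 0.1 * rng.normal(size=20)

X_adv = adversarial_augment(X, y, w)
assert loss(X_adv, y, w) > loss(X, y, w)  # the new examples are harder
X_aug = np.vstack([X, X_adv])             # append, then retrain the model
```

In the actual method this perturb-append-retrain cycle repeats, so the model is progressively exposed to harder fictitious domains.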

Weakly Supervised Disentanglement with Guarantees

A theoretical framework is provided for analyzing the disentanglement guarantees (or lack thereof) conferred by weak supervision when coupled with learning algorithms based on distribution matching, and the guarantees and limitations of several weak supervision methods are empirically verified.

Gradient Matching for Domain Generalization

An inter-domain gradient matching objective that targets domain generalization by maximizing the inner product between gradients from different domains is proposed, and a simpler first-order algorithm named Fish that approximates its optimisation is derived.
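A minimal sketch of the first-order idea, assuming a linear model and toy data (all names and hyperparameters here are illustrative, not from the Fish paper): take a gradient step on each domain in sequence, then move the weights toward the resulting point, which implicitly encourages agreement between per-domain gradients.

```python
import numpy as np

def domain_grad(X, y, w):
    """Gradient of the mean squared error of a linear model on one domain."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def fish_update(w, Xs, ys, inner_lr=0.05, meta_lr=0.5):
    """One Fish-style step: sequential per-domain gradient steps,
    then an interpolation toward the resulting point -- a first-order
    surrogate for maximizing the inter-domain gradient inner product."""
    w_tilde = w.copy()
    for X, y in zip(Xs, ys):
        w_tilde -= inner_lr * domain_grad(X, y, w_tilde)
    return w + meta_lr * (w_tilde - w)

# Two toy domains sharing the same underlying linear mechanism.
rng = np.random.default_rng(1)
w_true = np.array([1.0, -1.0])
Xs = [rng.normal(size=(30, 2)), rng.normal(size=(30, 2)) + 2.0]
ys = [X @ w_true + 0.1 * rng.normal(size=30) for X in Xs]

w = np.zeros(2)
for _ in range(300):
    w = fish_update(w, Xs, ys)
assert np.allclose(w, w_true, atol=0.1)  # recovers the shared mechanism
```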

Self-Supervised Learning Disentangled Group Representation as Feature

This paper proposes an iterative SSL algorithm, Iterative Partition-based Invariant Risk Minimization (IP-IRM), which successfully grounds the abstract semantics and the group acting on them in concrete contrastive learning.

Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations

This paper theoretically shows that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data, and trains more than 12,000 models covering the most prominent methods and evaluation metrics on seven different datasets.

Linear Disentangled Representations and Unsupervised Action Estimation

It is empirically shown that linear disentangled representations are not present in standard VAE models and instead require altering the loss landscape to induce them; such representations are also shown to be desirable with respect to classical disentanglement metrics.
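To make "linear disentangled representation" concrete: each generator of the symmetry group acts on its own latent subspace via a fixed linear (here, rotation) matrix, and the generators commute. This sketch is illustrative only; the subspace sizes and rotation angle are arbitrary choices, not the paper's setup.

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def action(z, k1=0, k2=0, step=np.pi / 4):
    """Linear disentangled group action: generator g1 rotates the first
    2-D latent plane, g2 the second; the representation is block-diagonal."""
    R = np.zeros((4, 4))
    R[:2, :2] = np.linalg.matrix_power(rotation(step), k1)
    R[2:, 2:] = np.linalg.matrix_power(rotation(step), k2)
    return R @ z

z = np.array([1.0, 0.0, 1.0, 0.0])
# g1 leaves the second subspace untouched...
assert np.allclose(action(z, k1=1)[2:], z[2:])
# ...and the two generators commute, as expected of independent factors.
assert np.allclose(action(action(z, k1=1), k2=1),
                   action(action(z, k2=1), k1=1))
```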

Learning to Compose Domain-Specific Transformations for Data Augmentation

The proposed method can make use of arbitrary, non-deterministic transformation functions, is robust to misspecified user input, and is trained on unlabeled data; it can be used to perform data augmentation for any end discriminative model.
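A minimal sketch of composing user-provided transformation functions, with a seeded random policy standing in for the learned sequence model (all function names are hypothetical; in the actual method the composition order is learned, not sampled uniformly):

```python
import random

def shift(x):
    return x + 1

def scale(x):
    return x * 2

def jitter(x, rng):
    return x + rng.choice([-1, 0, 1])  # a non-deterministic TF

def augment(x, n=3, seed=0):
    """Apply a length-n sequence of transformation functions in a
    (here random, in the real method learned) order."""
    rng = random.Random(seed)
    for _ in range(n):
        tf = rng.choice(["shift", "scale", "jitter"])
        if tf == "shift":
            x = shift(x)
        elif tf == "scale":
            x = scale(x)
        else:
            x = jitter(x, rng)
    return x

assert augment(5, seed=0) == augment(5, seed=0)  # reproducible given a seed
# Composition order matters: shift-then-scale differs from scale-then-shift.
assert scale(shift(1)) == 4 and shift(scale(1)) == 3
```

Because the composed function is treated as a black box, the transformation functions need not be differentiable, which is what lets the method accept arbitrary, non-deterministic user input.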