Corpus ID: 195820364

Invariant Risk Minimization

@article{Arjovsky2019InvariantRM,
  title={Invariant Risk Minimization},
  author={Mart{\'i}n Arjovsky and L{\'e}on Bottou and Ishaan Gulrajani and David Lopez-Paz},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.02893}
}
We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions. To achieve this goal, IRM learns a data representation such that the optimal classifier, on top of that data representation, matches for all training distributions. Through theory and experiments, we show how the invariances learned by IRM relate to the causal structures governing the data and enable out-of-distribution generalization. 
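The penalty used in the paper's experiments (the IRMv1 relaxation) measures how far each environment's risk is from being stationary with respect to a fixed scalar "dummy" classifier w = 1.0 placed on top of the representation. A minimal PyTorch sketch of that penalty, assuming binary labels and scalar logits (variable names are illustrative, not from the paper's released code):

import torch
import torch.nn.functional as F

def irm_penalty(logits, labels):
    # IRMv1 penalty: squared norm of the gradient of the per-environment
    # risk with respect to a fixed dummy classifier w = 1.0.
    scale = torch.tensor(1.0, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * scale, labels.float())
    grad = torch.autograd.grad(loss, [scale], create_graph=True)[0]
    return (grad ** 2).sum()

def irm_objective(model, envs, lam=1.0):
    # Sum of per-environment empirical risks plus the invariance penalty;
    # envs is a list of (inputs, labels) batches, one per training environment.
    risk, penalty = 0.0, 0.0
    for x, y in envs:
        logits = model(x).squeeze(-1)
        risk = risk + F.binary_cross_entropy_with_logits(logits, y.float())
        penalty = penalty + irm_penalty(logits, y)
    return risk + lam * penalty

With lam = 0 this reduces to pooled empirical risk minimization; a large lam pushes the representation toward one whose optimal classifier is the same in every training environment.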

Citations

Kernelized Heterogeneous Risk Minimization
TLDR
This paper proposes Kernelized Heterogeneous Risk Minimization (KerHRM), which achieves both latent heterogeneity exploration and invariant learning in kernel space, and then feeds the result back to the original neural network by assigning an invariant gradient direction.
Continual Invariant Risk Minimization
TLDR
This work generalizes the concept of IRM to scenarios where environments are observed sequentially, and extends IRM under a variational Bayesian and bilevel framework, creating a general approach to continual invariant risk minimization.
Optimal Representations for Covariate Shifts
TLDR
This paper introduces a simple objective whose optima are exactly the representations on which risk minimizers are guaranteed to be robust to Bayes-preserving shifts, e.g., covariate shift.
Generalized Invariant Risk Minimization: relating adaptation and invariant representation learning
TLDR
This work introduces Generalized Invariant Risk Minimization (G-IRM), a technique that takes a pre-specified adaptation mechanism and aims to find invariant representations that (a) perform well across multiple different training environments and (b) cannot be improved through adaptation to individual environments.
Invariant Risk Minimization Games
TLDR
A simple training algorithm based on best-response dynamics is developed that yields similar or better empirical accuracy, with much lower variance, than solving the challenging bi-level optimization problem of Arjovsky et al. (2019); a sketch of these dynamics follows below.
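For intuition, the game-theoretic formulation keeps one classifier head per environment on top of a shared representation and lets each head play a best response to the others. A rough PyTorch sketch of one round of such dynamics, in the spirit of (but not taken from) the paper's algorithm; binary labels, scalar heads, and a representation held fixed within the round are all simplifying assumptions:

import torch
import torch.nn.functional as F

def best_response_round(phi, classifiers, envs, lr=1e-2):
    # Each environment in turn updates only its own classifier head while
    # the other heads are held fixed; the ensemble prediction is the
    # average of all heads applied to the shared features phi(x).
    for e, (x, y) in enumerate(envs):
        opt = torch.optim.SGD(classifiers[e].parameters(), lr=lr)
        opt.zero_grad()
        z = phi(x).detach()  # treat the representation as fixed this round
        with torch.no_grad():  # opponents' heads are frozen for player e
            fixed = sum(w(z) for j, w in enumerate(classifiers) if j != e)
        logits = ((fixed + classifiers[e](z)) / len(classifiers)).squeeze(-1)
        loss = F.binary_cross_entropy_with_logits(logits, y.float())
        loss.backward()
        opt.step()

The paper's analysis ties the Nash equilibria of this game to invariant predictors; the sketch above only illustrates the update pattern.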
On Invariance Penalties for Risk Minimization
TLDR
This work proposes an alternative invariance penalty by revisiting the Gramian matrix of the data representation, discusses the role of its eigenvalues in the relationship between the risk and the invariance penalty, and demonstrates that this matrix is ill-conditioned on known counterexamples.
Optimal Representations for Covariate Shift
TLDR
This paper introduces a simple variational objective whose optima are exactly the representations on which risk minimizers are guaranteed to be robust to any distribution shift that preserves the Bayes predictor, e.g., covariate shift.
Near-Optimal Linear Regression under Distribution Shift
TLDR
This work develops estimators that achieve minimax linear risk for linear regression problems under distribution shift and shows that linear minimax estimators are within an absolute constant of the minimax risk even among nonlinear estimators for various source/target distributions.
Conditional entropy minimization principle for learning domain invariant representation features
TLDR
This paper proves that, under certain assumptions, the representation function can exactly recover the true invariant features, and shows that the proposed approach is closely related to the well-known Information Bottleneck framework.
Balancing Fairness and Robustness via Partial Invariance
TLDR
This work introduces flexibility into the IRM framework by partitioning the environments based on hierarchical differences while enforcing invariance locally within the partitions; the results show that this partial invariant risk minimization can alleviate the trade-off between fairness and risk in certain settings.

References

Showing 1-10 of 57 references
Principles of Risk Minimization for Learning Theory
TLDR
Systematic improvements in prediction power over empirical risk minimization are illustrated in an application to zip-code recognition.
Invariant Models for Causal Transfer Learning
TLDR
This work relaxes the usual covariate shift assumption and assumes that it holds true for a subset of predictor variables: the conditional distribution of the target variable given this subset of predictors is invariant over all tasks.
On causal and anticausal learning
TLDR
This work considers the problem of function estimation in the case where an underlying causal model can be inferred, formulates a hypothesis for when semi-supervised learning can help, and corroborates it with empirical results.
Robust Supervised Learning
TLDR
This work considers a novel framework where a learner may influence the test distribution in a bounded way, and derives an efficient algorithm that acts as a wrapper around a broad class of existing supervised learning algorithms while guaranteeing more robust behavior under changes in the input distribution.
Stable Prediction across Unknown Environments
TLDR
This paper proposes a novel Deep Global Balancing Regression (DGBR) algorithm that jointly optimizes a deep auto-encoder model for feature selection and a global balancing model for stable prediction across unknown environments, and demonstrates that the algorithm outperforms state-of-the-art methods for stable prediction across unknown environments.
Learning Causal Structures Using Regression Invariance
TLDR
A notion of completeness for a causal inference algorithm in this setting is defined and an alternate algorithm is presented that has significantly improved computational and sample complexity compared to the baseline algorithm.
Analysis of Representations for Domain Adaptation
TLDR
The theory illustrates the tradeoffs inherent in designing a representation for domain adaptation and gives a new justification for a recently proposed model which explicitly minimizes the difference between the source and target domains, while at the same time maximizing the margin of the training set.
Deep Domain Generalization via Conditional Invariant Adversarial Networks
TLDR
This work proposes an end-to-end conditional invariant deep domain generalization approach that leverages deep neural networks for domain-invariant representation learning, and demonstrates the effectiveness of the proposed method.
Statistical learning theory
TLDR
Presenting a method for determining the necessary and sufficient conditions for consistency of learning processes, the author covers function estimation from small samples, the application of these estimates to real-life problems, and much more.
Invariant Causal Prediction for Nonlinear Models
TLDR
This work presents and evaluates an array of methods for nonlinear and nonparametric versions of ICP that learn the causal parents of given target variables, and finds that an approach which first fits a nonlinear model on data pooled over all environments and then tests for differences between the residual distributions across environments is quite robust across a large variety of simulation settings (a rough sketch of this pooled-residual test follows below).
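The pooled-residual test in that last summary is easy to prototype: fit one nonlinear regressor per candidate predictor subset on data pooled over all environments, then check whether any single environment's residual distribution differs from the rest. A rough sketch with scikit-learn and a two-sample Kolmogorov-Smirnov test, assuming a small number of predictors (an illustration of the idea, not the paper's exact procedure):

import numpy as np
from itertools import combinations
from scipy.stats import ks_2samp
from sklearn.ensemble import RandomForestRegressor

def invariant_sets(X, y, env, alpha=0.05):
    # Accept a predictor subset S as plausibly invariant when no
    # environment's residual distribution differs from the others'.
    accepted = []
    for size in range(1, X.shape[1] + 1):
        for S in combinations(range(X.shape[1]), size):
            cols = list(S)
            # One model fit on data pooled over all environments.
            model = RandomForestRegressor(n_estimators=100).fit(X[:, cols], y)
            resid = y - model.predict(X[:, cols])
            # KS test: residuals of each environment vs the rest.
            pvals = [ks_2samp(resid[env == e], resid[env != e]).pvalue
                     for e in np.unique(env)]
            if min(pvals) > alpha:  # no environment rejects invariance
                accepted.append(S)
    return accepted

Exhaustive subset search is exponential, so this only scales to a handful of predictors; the paper studies a wider array of tests and search strategies.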