# Invariant Risk Minimization

```bibtex
@article{Arjovsky2019InvariantRM,
  title   = {Invariant Risk Minimization},
  author  = {Mart{\'i}n Arjovsky and L{\'e}on Bottou and Ishaan Gulrajani and David Lopez-Paz},
  journal = {ArXiv},
  year    = {2019},
  volume  = {abs/1907.02893}
}
```

We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions. To achieve this goal, IRM learns a data representation such that the optimal classifier, on top of that data representation, matches for all training distributions. Through theory and experiments, we show how the invariances learned by IRM relate to the causal structures governing the data and enable out-of-distribution generalization.
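The practical relaxation popularized by this line of work (often called IRMv1) replaces the bi-level problem with a penalty: the squared gradient of each environment's risk with respect to a fixed scalar "dummy" classifier multiplying the representation. A minimal numerical sketch under squared-error risk, using a finite-difference gradient; the function names are illustrative, not taken from the paper's code:

```python
import numpy as np

def irm_penalty(logits, y, eps=1e-3):
    """Approximate the IRMv1 penalty || d/dw R_e(w * logits) |_{w=1} ||^2
    for one environment, via a central finite difference at w = 1."""
    def risk(w):
        # squared-error risk of the scaled predictions
        return np.mean((w * logits - y) ** 2)
    grad = (risk(1.0 + eps) - risk(1.0 - eps)) / (2 * eps)
    return grad ** 2

def irm_objective(envs, lam=1.0):
    """Sum over environments of (risk + lambda * invariance penalty).

    envs: list of (logits, y) pairs, one per training environment.
    """
    total = 0.0
    for logits, y in envs:
        total += np.mean((logits - y) ** 2) + lam * irm_penalty(logits, y)
    return total
```

When the classifier on top of the representation is already optimal in every environment, rescaling it cannot reduce any environment's risk, so each per-environment gradient (and hence the penalty) vanishes; a representation that exploits environment-specific correlations incurs a nonzero penalty in at least one environment.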

## 499 Citations

Kernelized Heterogeneous Risk Minimization

- Computer Science, NeurIPS
- 2021

This paper proposes Kernelized Heterogeneous Risk Minimization (KerHRM), which achieves both the latent heterogeneity exploration and invariant learning in kernel space, and then gives feedback to the original neural network by appointing invariant gradient direction.

Continual Invariant Risk Minimization

- Computer Science
- 2020

This work generalizes the concept of IRM to scenarios where environments are observed sequentially, and extends IRM under a variational Bayesian and bilevel framework, creating a general approach to continual invariant risk minimization.

Optimal Representations for Covariate Shifts

- Computer Science
- 2021

A simple objective is introduced whose optima are exactly the representations on which risk minimizers are guaranteed to be robust to Bayes-preserving shifts, e.g., covariate shifts.

Generalized Invariant Risk Minimization: relating adaptation and invariant representation learning

- Computer Science
- 2020

This work introduces Generalized Invariant Risk Minimization (G-IRM), a technique that takes a pre-specified adaptation mechanism and aims to find invariant representations that (a) perform well across multiple different training environments and (b) cannot be improved through adaptation to individual environments.

Invariant Risk Minimization Games

- Computer Science, ICML
- 2020

A simple training algorithm based on best response dynamics is developed that yields similar or better empirical accuracy, with much lower variance, than solving the challenging bi-level optimization problem of Arjovsky et al. (2019).

On Invariance Penalties for Risk Minimization

- Mathematics, ArXiv
- 2021

This work proposes an alternative invariance penalty by revisiting the Gramian matrix of the data representation, discusses the role of the Gramian's eigenvalues in the relationship between the risk and the invariance penalty, and demonstrates that the Gramian is ill-conditioned for known counterexamples to IRM.

Optimal Representations for Covariate Shift

- Computer Science, ArXiv
- 2022

A simple variational objective is introduced whose optima are exactly the set of all representations on which risk minimizers are guaranteed to be robust to any distribution shift that preserves the Bayes predictor, e.g., covariate shifts.

Near-Optimal Linear Regression under Distribution Shift

- Mathematics, Computer Science, ICML
- 2021

This work develops estimators that achieve minimax linear risk for linear regression problems under distribution shift and shows that linear minimax estimators are within an absolute constant of the minimax risk even among nonlinear estimators for various source/target distributions.

Conditional entropy minimization principle for learning domain invariant representation features

- Computer Science, ArXiv
- 2022

This paper proves that, under particular assumptions, the representation function can precisely recover the true invariant features, and shows that the proposed approach is closely related to the well-known Information Bottleneck framework.

Balancing Fairness and Robustness via Partial Invariance

- Computer Science, ArXiv
- 2021

This work introduces flexibility into the IRM framework by partitioning the environments based on hierarchical differences and enforcing invariance locally within the partitions; the results show that partial invariant risk minimization can alleviate the trade-off between fairness and risk in certain settings.

## References

Showing 1-10 of 57 references

Principles of Risk Minimization for Learning Theory

- Computer Science, NIPS
- 1991

Systematic improvements in prediction power and empirical risk minimization are illustrated in application to zip-code recognition.

Invariant Models for Causal Transfer Learning

- Computer Science, J. Mach. Learn. Res.
- 2018

This work relaxes the usual covariate shift assumption and assumes that it holds true for a subset of predictor variables: the conditional distribution of the target variable given this subset of predictors is invariant over all tasks.

On causal and anticausal learning

- Computer Science, ICML
- 2012

The problem of function estimation in the case where an underlying causal model can be inferred is considered; a hypothesis for when semi-supervised learning can help is formulated and corroborated with empirical results.

Robust Supervised Learning

- Computer Science, AAAI
- 2005

This work considers a novel framework where a learner may influence the test distribution in a bounded way and derives an efficient algorithm that acts as a wrapper around a broad class of existing supervised learning algorithms while guaranteeing more robust behavior under changes in the input distribution.

Stable Prediction across Unknown Environments

- Computer Science, KDD
- 2018

This paper proposes a novel Deep Global Balancing Regression (DGBR) algorithm that jointly optimizes a deep auto-encoder model for feature selection and a global balancing model for stable prediction across unknown environments, and demonstrates that the algorithm outperforms state-of-the-art methods for stable prediction across unknown environments.

Learning Causal Structures Using Regression Invariance

- Computer Science, NIPS
- 2017

A notion of completeness for a causal inference algorithm in this setting is defined and an alternate algorithm is presented that has significantly improved computational and sample complexity compared to the baseline algorithm.

Analysis of Representations for Domain Adaptation

- Computer Science, NIPS
- 2006

The theory illustrates the tradeoffs inherent in designing a representation for domain adaptation and gives a new justification for a recently proposed model which explicitly minimizes the difference between the source and target domains, while at the same time maximizing the margin of the training set.

Deep Domain Generalization via Conditional Invariant Adversarial Networks

- Computer Science, ECCV
- 2018

This work proposes an end-to-end conditional invariant deep domain generalization approach by leveraging deep neural networks for domain-invariant representation learning and proves the effectiveness of the proposed method.

Statistical learning theory

- Computer Science
- 1998

Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimation from small data pools, the application of these estimates to real-life problems, and much more.

Invariant Causal Prediction for Nonlinear Models

- Computer Science, Journal of Causal Inference
- 2018

This work presents and evaluates an array of methods for nonlinear and nonparametric versions of ICP for learning the causal parents of given target variables and finds that an approach which first fits a nonlinear model with data pooled over all environments and then tests for differences between the residual distributions across environments is quite robust across a large variety of simulation settings.