Causal Transportability for Visual Recognition

@inproceedings{mao2022causal,
  title={Causal Transportability for Visual Recognition},
  author={Chengzhi Mao and Kevin Xia and James Wang and Hongya Wang and Junfeng Yang and Elias Bareinboim and Carl Vondrick},
  booktitle={2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}
Visual representations underlie object recognition tasks, but they often contain both robust and non-robust features. Our main observation is that image classifiers may perform poorly on out-of-distribution samples because the spurious correlations between non-robust features and labels may change in a new environment. By analyzing procedures for out-of-distribution generalization with a causal graph, we show that standard classifiers fail because the association between images and labels is…
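The causal analysis described above, and the front-door-based related work listed below, build on the standard front-door adjustment from causal inference. As background only (the paper's exact instantiation may differ), for an input $X$, label $Y$, and mediator $Z$ lying on the causal path from $X$ to $Y$, the interventional distribution is:

```latex
P(y \mid do(x)) \;=\; \sum_{z} P(z \mid x) \sum_{x'} P(y \mid x', z)\, P(x')
```

Intuitively, the inner sum estimates the effect of the mediator $z$ on $y$ while averaging out the confounded input, so the spurious (backdoor) association between non-robust features and labels is removed.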


Doubly Right Object Recognition: A Why Prompt for Visual Rationales

By transferring the rationales from language models into visual representations through a tailored dataset, it is shown that a “why prompt,” which adapts large visual representations to produce correct rationales, can be learned.

Front-door Adjustment via Style Transfer for Out-of-distribution Generalisation

Inspired by the capability of style transfer in image generation, the combination of the mediator variable with different generated images in the front-door formula is interpreted, and novel algorithms to estimate it are proposed.

Avoiding Calvinist Decision Traps using Structural Causal Models

An adversarial scheme is presented in which a CDT agent facing a bandit problem can be tricked into sub-optimal choices if it follows temporal CDT, and an axiom is proposed to ground the orientation of arrows in the causal graph of a decision problem.

Evaluating the Impact of Geometric and Statistical Skews on Out-Of-Distribution Generalization

Out-of-distribution (OOD) or domain generalization is the problem of generalizing to unseen distributions; failures arise from spurious correlations, which in turn stem from statistical and geometric skews.

Generative Interventions for Causal Learning

Experiments, visualizations, and theoretical results show this method learns robust representations more consistent with the underlying causal relationships, and improves performance on multiple datasets demanding out-of-distribution generalization.

ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models

A highly automated platform is developed that enables gathering datasets with controls at scale, generating datasets that exercise models in new ways and thus providing valuable feedback to researchers.

Domain Generalization using Causal Matching

An iterative algorithm called MatchDG is proposed that approximates base object similarity by using a contrastive loss formulation adapted for multiple domains and learns matches that have over 25% overlap with ground-truth object matches in MNIST and Fashion-MNIST.

Towards Shape Biased Unsupervised Representation Learning for Domain Generalization

This work proposes a learning framework to improve the shape bias property of self-supervised methods by integrating domain diversification and jigsaw puzzles and shows that this framework outperforms state-of-the-art domain generalization methods by a large margin.

Transporting Causal Mechanisms for Unsupervised Domain Adaptation

Transporting Causal Mechanisms (TCM) is proposed, to identify the confounder stratum and representations by using the domain-invariant disentangled causal mechanisms, which are discovered in an unsupervised fashion.

Learning Robust Global Representations by Penalizing Local Predictive Power

A method for training robust convolutional networks by penalizing the predictive power of the local representations learned by earlier layers, which forces networks to discard predictive signals such as color and texture that can be gleaned from local receptive fields and to rely instead on the global structures of the image.

Learning Transferable Visual Models From Natural Language Supervision

It is demonstrated that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet.

Invariant Models for Causal Transfer Learning

This work relaxes the usual covariate shift assumption and assumes that it holds true for a subset of predictor variables: the conditional distribution of the target variable given this subset of predictors is invariant over all tasks.

A Simple Framework for Contrastive Learning of Visual Representations

It is shown that the composition of data augmentations plays a critical role in defining effective predictive tasks, that introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and that contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning.

Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

This paper proposes an online algorithm, SwAV, that takes advantage of contrastive methods without requiring the computation of pairwise comparisons, and uses a swapped prediction mechanism in which it predicts the cluster assignment of one view from the representation of another view.