Corpus ID: 244117501

Learning Generalized Gumbel-max Causal Mechanisms

Guy Lorberbom, Daniel D. Johnson, Chris J. Maddison, Daniel Tarlow, Tamir Hazan
Neural Information Processing Systems

To perform counterfactual reasoning in Structural Causal Models (SCMs), one needs to know the causal mechanisms, which provide factorizations of conditional distributions into noise sources and deterministic functions mapping realizations of noise to samples. Unfortunately, the causal mechanism is not uniquely identified by data that can be gathered by observing and interacting with the world, so there remains the question of how to choose causal mechanisms. In recent work, Oberst & Sontag… 
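As a rough illustration of what "noise sources and deterministic functions" means for a categorical variable, the standard Gumbel-max construction can be sketched as below. This is a minimal sketch and not the paper's learned, generalized mechanism: the function names (`sample_gumbel`, `gumbel_max`, `empirical_frequencies`) are hypothetical, only the Python standard library is used, and all probabilities are assumed to be strictly positive.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) variate via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max(probs, noise):
    """Deterministic part of the mechanism: argmax_i (log p_i + g_i).

    `probs` are strictly positive category probabilities (an assumption of
    this sketch) and `noise` holds one Gumbel variate per category.
    """
    scores = [math.log(p) + g for p, g in zip(probs, noise)]
    return max(range(len(probs)), key=scores.__getitem__)


def empirical_frequencies(probs, n, seed=0):
    """Check that fresh noise pushed through gumbel_max reproduces `probs`."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    for _ in range(n):
        noise = [sample_gumbel(rng) for _ in probs]
        counts[gumbel_max(probs, noise)] += 1
    return [c / n for c in counts]


if __name__ == "__main__":
    observed = [0.7, 0.2, 0.1]      # distribution under the observed setting
    intervened = [0.2, 0.2, 0.6]    # distribution under a hypothetical intervention
    rng = random.Random(1)
    noise = [sample_gumbel(rng) for _ in observed]
    # Reusing the same noise realization couples the two outcomes.
    print(gumbel_max(observed, noise), gumbel_max(intervened, noise))
    print(empirical_frequencies(observed, 20000))
```

Reusing one noise vector across two distributions is what couples the factual and counterfactual outcomes. A full counterfactual query would also require sampling the noise conditioned on the observed outcome, which this sketch leaves out.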

Counterfactual (Non-)identifiability of Learned Structural Causal Models

This work warns practitioners about the non-identifiability of counterfactual inference from observational data, even in the absence of unobserved confounding and assuming known causal structure, and provides an impossibility result for counterfactual identifiability for general generation mechanisms with multi-dimensional exogenous variables.

Estimating Categorical Counterfactuals via Deep Twin Networks

This work introduces the notion of counterfactual ordering, a principle that posits desirable properties causal mechanisms should possess, and proves that it is equivalent to specific functional constraints on the causal mechanisms.

Counterfactual Analysis in Dynamic Models: Copulas and Bounds

The entire space of SCMs obeying counterfactual stability (CS) is characterized, and it is used to negatively answer the open question of Oberst and Sontag regarding the uniqueness of the Gumbel-max mechanism for modeling CS.

Deep Counterfactual Estimation with Categorical Background Variables

This work introduces CounterFactual Query Prediction (CFQP), a novel method to infer counterfactuals from continuous observations when the background variables are categorical, and shows that the method significantly outperforms previously available deep-learning-based counterfactual methods, both theoretically and empirically, on time series and image data.

Counterfactual Inference of Second Opinions

A set invariant Gumbel-Max structural causal model is designed where the structure of the noise governing the sub-mechanisms underpinning the model depends on an intuitive notion of similarity between experts which can be estimated from data.

Causal Graph Discovery from Self and Mutually Exciting Time Series

A generalized linear structural causal model, coupled with a novel data-adaptive linear regularization, is proposed to recover causal directed acyclic graphs (DAGs) from time series while achieving prediction performance comparable to powerful “black-box” models such as XGBoost.

Counterfactual Analysis in Dynamic Latent State Models

To the best of the authors' knowledge, this work is the first to compute lower and upper bounds on a counterfactual query in a dynamic latent-state model, and it applies them to a breast cancer case study.



Sample-Efficient Reinforcement Learning via Counterfactual-Based Data Augmentation

A data-efficient RL algorithm is proposed that exploits structural causal models (SCMs) to model the state dynamics, which are estimated by leveraging both commonalities and differences across subjects; the algorithm converges to the optimal value function.

Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search

The Counterfactually-Guided Policy Search (CF-GPS) algorithm is proposed, which leverages structural causal models for counterfactual evaluation of arbitrary policies on individual off-policy episodes and can improve on vanilla model-based RL algorithms by making use of available logged data to de-bias model predictions.

Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models

This work introduces an off-policy evaluation procedure for highlighting episodes where applying a reinforcement-learned policy is likely to have produced a substantially different outcome than the observed policy, along with a class of structural causal models for generating counterfactual trajectories in finite partially observable Markov decision processes (POMDPs).

On Pearl’s Hierarchy and the Foundations of Causal Inference

This chapter develops a novel and comprehensive treatment of the Pearl Causal Hierarchy (PCH) through two complementary lenses, one logical-probabilistic and the other inferential-graphical, and investigates an inferential system known as do-calculus, showing how it can be sufficient, and in many cases necessary, to allow inferences across the PCH’s layers.

Probabilities Of Causation: Three Counterfactual Interpretations And Their Identification

It is shown that necessity and sufficiency are two independent aspects of causation, and that both should be invoked in the construction of causal explanations for specific scenarios.

Some Guidelines and Guarantees for Common Random Numbers

Common random numbers (CRN) is a widely used technique for reducing variance in comparing stochastic systems through simulation. Its popularity derives from its intuitive appeal and ease of

Categorical Reparameterization with Gumbel-Softmax

It is shown that the Gumbel-Softmax estimator outperforms state-of-the-art gradient estimators on structured output prediction and unsupervised generative modeling tasks with categorical latent variables, and enables large speedups on semi-supervised classification.

The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables

Concrete random variables, which are continuous relaxations of discrete random variables, are introduced as a new family of distributions with closed-form densities and a simple reparameterization, and the effectiveness of Concrete relaxations on density estimation and structured prediction tasks using neural networks is demonstrated.
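The relaxation described in the two summaries above (Gumbel-Softmax and Concrete) can be illustrated by replacing the argmax over Gumbel-perturbed log-probabilities with a temperature-controlled softmax. The following is a minimal sketch under that standard formulation, not code from either paper: the function name `relaxed_sample` and its parameters are hypothetical, probabilities are assumed strictly positive, and the noise is passed in explicitly so the behavior is easy to inspect.

```python
import math


def relaxed_sample(probs, noise, temperature):
    """Softmax of (log p_i + g_i) / temperature.

    `noise` holds one Gumbel(0, 1) variate per category. As `temperature`
    approaches 0 the output approaches a one-hot vector at the argmax; as it
    grows, the output approaches the uniform distribution.
    """
    logits = [(math.log(p) + g) / temperature for p, g in zip(probs, noise)]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    probs = [0.5, 0.5]
    noise = [0.0, 1.0]  # fixed noise for illustration only
    print(relaxed_sample(probs, noise, temperature=0.01))    # nearly one-hot
    print(relaxed_sample(probs, noise, temperature=1000.0))  # nearly uniform
```

Because the output is a smooth function of `probs` for a fixed noise draw, gradients can be propagated through it, which is the reparameterization property the summaries refer to.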

Learning with a Wasserstein Loss

An efficient learning algorithm based on this regularization is developed, along with a novel extension of the Wasserstein distance from probability measures to unnormalized measures. The resulting loss can encourage smoothness of the predictions with respect to a chosen metric on the output space.

A Poisson process model for Monte Carlo

Simulating samples from arbitrary probability distributions is a major research program of statistical computing. Recent work has shown promise in an old idea, that sampling from a discrete