Corpus ID: 212717971

Invariant Causal Prediction for Block MDPs

@inproceedings{Zhang2020InvariantCP,
  title={Invariant Causal Prediction for Block MDPs},
  author={Amy Zhang and Clare Lyle and Shagun Sodhani and Angelos Filos and Marta Kwiatkowska and Joelle Pineau and Yarin Gal and Doina Precup},
  booktitle={ICML},
  year={2020}
}
Generalization across environments is critical to the successful application of reinforcement learning algorithms to real-world challenges. In this paper, we consider the problem of learning abstractions that generalize in block MDPs, families of environments with a shared latent state space and dynamics structure over that latent space, but varying observations. We leverage tools from causal inference to propose a method of invariant prediction to learn model-irrelevance state abstractions…
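The abstract names the core idea: an encoder is trained across environments so that a single latent dynamics and reward model fits all of them. The sketch below is only an illustration of that shared-model idea, not the authors' algorithm; all dimensions, names, and the variance-based invariance penalty are assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions, not taken from the paper.
OBS_DIM, LATENT_DIM, ACT_DIM = 64, 16, 4

phi = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(),
                    nn.Linear(128, LATENT_DIM))            # shared encoder
dynamics = nn.Linear(LATENT_DIM + ACT_DIM, LATENT_DIM)     # shared latent model
reward = nn.Linear(LATENT_DIM + ACT_DIM, 1)                # shared reward head
opt = torch.optim.Adam([*phi.parameters(), *dynamics.parameters(),
                        *reward.parameters()], lr=1e-3)

def env_loss(obs, act, next_obs, rew):
    """One environment's model-fitting loss under the shared abstraction."""
    z, z_next = phi(obs), phi(next_obs)
    za = torch.cat([z, act], dim=-1)
    model_err = (dynamics(za) - z_next.detach()).pow(2).mean()
    rew_err = (reward(za).squeeze(-1) - rew).pow(2).mean()
    return model_err + rew_err

def training_step(batches):
    """batches: one (obs, act, next_obs, rew) tuple per environment (>= 2).
    Because phi, dynamics, and reward are shared, minimizing the mean loss
    pushes toward a representation under which one model explains every
    environment; the variance term is a crude stand-in for an invariance
    penalty across environments."""
    losses = torch.stack([env_loss(*b) for b in batches])
    loss = losses.mean() + losses.var()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```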
Citations

Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion
We introduce Wasserstein Adversarial Proximal Policy Optimization (WAPPO), a novel algorithm for visual transfer in Reinforcement Learning that explicitly learns to align the distributions of…
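The summary names adversarial alignment of feature distributions with a Wasserstein objective. As a generic sketch of that family of techniques (not WAPPO itself), a WGAN-style critic can score encoder features from a source and a target domain; the critic architecture, learning rate, and weight-clipping constant are all assumptions.

```python
import torch
import torch.nn as nn

FEAT_DIM = 32                                   # hypothetical feature size
critic = nn.Sequential(nn.Linear(FEAT_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
c_opt = torch.optim.RMSprop(critic.parameters(), lr=5e-5)

def critic_step(f_src, f_tgt, clip=0.01):
    """Train the critic toward the Wasserstein-1 distance between source
    and target feature distributions (weight clipping as in WGAN)."""
    c_loss = critic(f_tgt.detach()).mean() - critic(f_src.detach()).mean()
    c_opt.zero_grad(); c_loss.backward(); c_opt.step()
    for p in critic.parameters():               # enforce the Lipschitz constraint
        p.data.clamp_(-clip, clip)

def confusion_loss(f_src, f_tgt):
    """Added to the encoder's RL objective: minimizing this moves the two
    feature distributions toward each other, confusing the critic."""
    return critic(f_src).mean() - critic(f_tgt).mean()
```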
Domain Adversarial Reinforcement Learning
This work enforces invariance of the learned representations to visual domains via a domain-adversarial optimization process and empirically shows that this approach yields significant generalization improvements to new, unseen domains.
Model-Invariant State Abstractions for Model-Based Reinforcement Learning
This paper introduces a new type of state abstraction called model-invariance, which allows for generalization to novel combinations of unseen values of state variables, something that non-factored forms of state abstractions cannot do.
Intervention Design for Effective Sim2Real Transfer
It is found that perturbations to the environment do not have to be realistic, but need only show variation along dimensions that also vary in the real world, and that an explicit invariance-inducing objective improves generalization in sim2sim and sim2real transfer settings over data augmentation or domain randomization alone.
Learning Invariant Representations for Reinforcement Learning without Reconstruction
This work studies how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying on domain knowledge or pixel reconstruction, and proposes a method to learn robust latent representations that encode only the task-relevant information from observations.
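The underlying technique here (deep bisimulation for control) trains the encoder so that L1 distances between latent states match a bisimulation-style target built from reward differences and a Wasserstein distance between predicted next-latent distributions. A simplified sketch, assuming a learned Gaussian dynamics model with diagonal covariance; tensor names and the random pairing scheme are assumptions:

```python
import torch
import torch.nn.functional as F

def w2_gaussian(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between diagonal Gaussians."""
    return torch.sqrt((mu1 - mu2).pow(2).sum(-1)
                      + (sigma1 - sigma2).pow(2).sum(-1) + 1e-8)

def bisim_loss(z, rew, mu, sigma, gamma=0.99):
    """Match latent L1 distances to a bisimulation-style target.
    z: encoded states (B, D); rew: rewards (B,);
    mu, sigma: predicted next-latent Gaussian parameters (B, D)."""
    perm = torch.randperm(z.size(0))            # compare random pairs in the batch
    z2, r2 = z[perm], rew[perm]
    mu2, sigma2 = mu[perm], sigma[perm]
    latent_dist = (z - z2).abs().sum(-1)
    target = (rew - r2).abs() + gamma * w2_gaussian(
        mu.detach(), sigma.detach(), mu2.detach(), sigma2.detach())
    return F.mse_loss(latent_dist, target)
```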
AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning
This paper constructs a generative environment model for the structural relationships among variables in the system and embeds changes in a compact way, providing a clear and interpretable picture of what the changes are, where they occur, and how to adapt.
Causal Inference Q-Network: Toward Resilient Reinforcement Learning
This work discusses the importance of causal relations under the proposed framework and introduces a causal-inference-based DRL algorithm, the Causal Inference Q-Network (CIQ); experimental results show that CIQ achieves higher performance and greater resilience against observational interference.
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
A theoretically motivated policy similarity metric (PSM) for measuring behavioral similarity between states is introduced, and the resulting policy similarity embeddings (PSEs) are shown to improve generalization on diverse benchmarks, including LQR with spurious correlations, a jumping task from pixels, and the Distracting DM Control Suite.
Decoupling Value and Policy for Generalization in Reinforcement Learning
IDAAC decouples the optimization of the policy and the value function, using separate networks to model them, and introduces an auxiliary loss that encourages the representation to be invariant to task-irrelevant properties of the environment.

References

Showing 1–10 of 48 references
Invariant Risk Minimization
This work introduces Invariant Risk Minimization (IRM), a learning paradigm that estimates invariant correlations across multiple training distributions, and shows how the invariances learned by IRM relate to the causal structures governing the data and enable out-of-distribution generalization.
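IRM's practical objective (IRMv1 in the paper) penalizes the gradient of each environment's risk with respect to a fixed dummy scale w = 1.0 multiplying the classifier output. A minimal sketch following that published form; the model, batch layout, and penalty weight lam are assumptions:

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, y):
    """IRMv1 penalty: squared gradient of the risk with respect to a
    dummy scalar w = 1.0 that multiplies the logits."""
    w = torch.tensor(1.0, requires_grad=True)
    loss = F.cross_entropy(logits * w, y)
    grad = torch.autograd.grad(loss, [w], create_graph=True)[0]
    return grad.pow(2).sum()

def irm_objective(per_env_batches, model, lam=100.0):
    """Mean per-environment risk plus the invariance penalty.
    per_env_batches: list of (x, y) pairs, one per training environment;
    lam is a hypothetical penalty weight."""
    risks, penalties = [], []
    for x, y in per_env_batches:
        logits = model(x)
        risks.append(F.cross_entropy(logits, y))
        penalties.append(irm_penalty(logits, y))
    return torch.stack(risks).mean() + lam * torch.stack(penalties).mean()
```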
Algorithmic Framework for Model-based Reinforcement Learning with Theoretical Guarantees
A novel algorithmic framework for designing and analyzing model-based RL algorithms with theoretical guarantees is introduced and a meta-algorithm with a theoretical guarantee of monotone improvement to a local maximum of the expected reward is designed.
Provably efficient RL with Rich Observations via Latent State Decoding
This work demonstrates how to inductively estimate a mapping from observations to latent states through a sequence of regression and clustering steps, and uses this decoding to construct good exploration policies.
Causal inference using invariant prediction: identification and confidence intervals
This work proposes to exploit the invariance of a prediction under a causal model for causal inference: given different experimental settings (for example, various interventions), the authors collect all models that show invariant predictive accuracy across settings and interventions; this yields valid confidence intervals for the causal relationships in quite general scenarios.
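The ICP procedure is simple to state: for each candidate set S of predictors, fit one regression on S pooled across environments and test whether the residual distribution is the same in every environment; the estimated causal predictors are the intersection of all accepted sets. A toy sketch using a t-test on residual means as a simplified invariance test (the actual method also tests variances and controls family-wise error more carefully):

```python
from itertools import chain, combinations
import numpy as np
from scipy import stats

def icp(X, y, env, alpha=0.05):
    """Simplified Invariant Causal Prediction.
    X: (n, p) predictors; y: (n,) target; env: (n,) environment labels."""
    n, p = X.shape
    envs = np.unique(env)
    subsets = chain.from_iterable(combinations(range(p), k)
                                  for k in range(p + 1))
    accepted = []
    for S in subsets:
        # Fit one linear model on subset S, pooled across all environments.
        Xs = np.column_stack([X[:, list(S)], np.ones(n)])
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ beta
        # Invariance test: residuals in each environment vs. the rest.
        pvals = [stats.ttest_ind(resid[env == e], resid[env != e]).pvalue
                 for e in envs]
        if min(pvals) > alpha / len(envs):      # crude Bonferroni correction
            accepted.append(set(S))
    # Predictors present in every accepted set are reported as causal.
    return set.intersection(*accepted) if accepted else set()
```

On data where only one column of X causally drives y while the others vary spuriously across environments, subsets containing that column tend to pass the test and the intersection recovers it.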
Towards a Unified Theory of State Abstraction for MDPs
This work provides a unified treatment of state abstraction for Markov decision processes by studying five particular abstraction schemes, some of which have been proposed in the past in different forms, and analyzing their usability for planning and learning.
DeepMind Control Suite
The DeepMind Control Suite is a set of continuous control tasks with a standardised structure and interpretable rewards, intended to serve as performance benchmarks for reinforcement learning agents.
Causality: Models, Reasoning and Inference
1. Introduction to probabilities, graphs, and causal models 2. A theory of inferred causation 3. Causal diagrams and the identification of causal effects 4. Actions, plans, and direct effects 5. …
Fantastic Generalization Measures and Where to Find Them
This work presents the first large-scale study of generalization in deep networks, investigating more than 40 complexity measures taken from both theoretical bounds and empirical studies, and showing surprising failures of some measures as well as promising measures for further research.
Meta-Learning without Memorization
This paper designs a meta-regularization objective using information theory that places precedence on data-driven adaptation, demonstrates its applicability to both contextual and gradient-based meta-learning algorithms, and applies it in practical settings where standard meta-learning has been difficult.
Observational Overfitting in Reinforcement Learning
This work provides a general framework for analyzing the case where the agent may mistakenly correlate reward with certain spurious features of the observations generated by the Markov Decision Process, and designs multiple synthetic benchmarks by modifying only the observation space of an MDP.