Don't Touch What Matters: Task-Aware Lipschitz Data Augmentation for Visual Reinforcement Learning

Zhecheng Yuan, Guozheng Ma, Yao Mu, Bo Xia, Bo Yuan, Xueqian Wang, Ping Luo, and Huazhe Xu
One of the key challenges in visual Reinforcement Learning (RL) is learning policies that generalize to unseen environments. Recently, data augmentation techniques aimed at enhancing data diversity have proven effective in improving the generalization ability of learned policies. However, due to the sensitivity of RL training, naively applying data augmentation, which transforms each pixel in a task-agnostic manner, may cause instability and damage the sample…

Look where you look! Saliency-guided Q-networks for visual RL tasks

SGQN vastly improves the generalization capability of Soft Actor-Critic agents and outperforms existing state-of-the-art methods on the Deepmind Control Generalization benchmark, setting a new reference in terms of training efficiency, generalization gap, and policy interpretability.

A Comprehensive Survey of Data Augmentation in Visual Reinforcement Learning

A principled taxonomy of the existing augmentation techniques used in visual RL and an in-depth discussion on how to better leverage augmented data in different scenarios are presented.

CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer

This work carefully designs a contrastive reinforcement learning paradigm to train CtrlFormer, enabling it to achieve high sample efficiency, which is important in control problems.

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies

This work considers robust policy learning which targets zero-shot generalization to unseen visual environments with large distributional shift and proposes SECANT, a novel self-expert cloning technique that leverages image augmentation in two stages to decouple robust representation learning from policy optimization.

Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation

This paper investigates causes of instability when using data augmentation in common off-policy RL algorithms and proposes a simple yet effective technique for stabilizing this class of algorithms under augmentation, and achieves generalization results competitive with state-of-the-art methods for image-based RL in environments with unseen visuals.

Automatic Data Augmentation for Generalization in Reinforcement Learning

This paper introduces three approaches for automatically finding an effective augmentation for any RL task, combined with two novel regularization terms for the policy and value function, required to make the use of data augmentation theoretically sound for actor-critic algorithms.

Generalization in Reinforcement Learning by Soft Data Augmentation

SOft Data Augmentation (SODA) is proposed, a method that decouples augmentation from policy learning and is found to significantly advance sample efficiency, generalization, and stability in training over state-of-the-art vision-based RL methods.

Reinforcement Learning with Augmented Data

It is shown that augmentations such as random translate, crop, color jitter, patch cutout, random convolutions, and amplitude scale can enable simple RL algorithms to outperform complex state-of-the-art methods across common benchmarks.
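Of the augmentations listed above, random translate is the one most commonly reused by later methods. A minimal NumPy sketch of it, implemented as pad-then-crop, is shown below; the function name, padding width, and edge-replication padding are illustrative choices, not details taken from the paper:

```python
import numpy as np

def random_shift(imgs, pad=4):
    """Random-translate augmentation: pad each image by `pad` pixels
    (replicating edge values) and crop back to the original size at a
    random offset, shifting content by up to `pad` pixels per axis.

    imgs: float array of shape (batch, height, width, channels)
    """
    b, h, w, c = imgs.shape
    padded = np.pad(imgs, ((0, 0), (pad, pad), (pad, pad), (0, 0)),
                    mode="edge")
    out = np.empty_like(imgs)
    for i in range(b):
        # Independent random offset for each image in the batch.
        top = np.random.randint(0, 2 * pad + 1)
        left = np.random.randint(0, 2 * pad + 1)
        out[i] = padded[i, top:top + h, left:left + w]
    return out

# Example: a batch of 8 single-channel 84x84 frames.
batch = np.random.rand(8, 84, 84, 1).astype(np.float32)
aug = random_shift(batch)
print(aug.shape)  # (8, 84, 84, 1)
```

Applied to each sampled minibatch before the actor and critic updates, this keeps observation shapes unchanged while varying pixel positions, which is what lets an otherwise unmodified RL algorithm benefit from it.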

Learning Invariant Representations for Reinforcement Learning without Reconstruction

This work studies how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction, and proposes a method to learn robust latent representations which encode only the task-relevant information from observations.

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels

The addition of the augmentation method dramatically improves SAC's performance, enabling it to reach state-of-the-art performance on the DeepMind control suite, surpassing model-based methods and the recently proposed contrastive learning method CURL.

Assessing Generalization in Deep Reinforcement Learning

The key finding is that "vanilla" deep RL algorithms generalize better than specialized schemes that were proposed specifically to tackle generalization.

Target-driven visual navigation in indoor scenes using deep reinforcement learning

This paper proposes an actor-critic model whose policy is a function of the goal as well as the current state, which allows better generalization and proposes the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine.

Quantifying Generalization in Reinforcement Learning

It is shown that deeper convolutional architectures improve generalization, as do methods traditionally found in supervised learning, including L2 regularization, dropout, data augmentation and batch normalization.