• Corpus ID: 222140759

Safety Aware Reinforcement Learning (SARL)

Santiago Miret, Somdeb Majumdar, Carroll L. Wainwright
As reinforcement learning agents become increasingly integrated into complex, real-world environments, designing for safety becomes a critical consideration. We focus specifically on scenarios where agents can cause undesired side effects while executing a policy on a primary task. Since multiple tasks can be defined for the same environment dynamics, there are two important challenges. First, we need to abstract the concept of safety that applies broadly to that environment…


Towards Safe Reinforcement Learning with a Safety Editor Policy

SEditor is a two-policy approach that learns a safety editor policy, which transforms potentially unsafe actions proposed by a utility maximizer policy into safe ones. It demonstrates outstanding utility performance with constraint violation rates as low as once per 2k.



Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

CARL first employs model-based RL to train a probabilistic model to capture uncertainty about transition dynamics and catastrophic states across varied source environments, and then plans to avoid actions that could lead to catastrophic states when exploring a new safety-critical environment with unknown dynamics.

A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation

A generalized version of the Bellman equation is proposed to learn a single parametric representation for optimal policies over the space of all possible preferences in MORL, with the goal of enabling few-shot adaptation to new tasks.

SafeLife 1.0: Exploring Side Effects in Complex Environments

We present SafeLife, a publicly available reinforcement learning environment that tests the safety of reinforcement learning agents. It contains complex, dynamic, tunable, procedurally generated

Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates

It is demonstrated that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.

Measuring and avoiding side effects using relative reachability

A general definition of side effects is introduced, based on relative reachability of states compared to a default state, that avoids these undesirable incentives in tasks that require irreversible actions and in environments that contain sources of change other than the agent.

The Ingredients of Real-World Robotic Reinforcement Learning

This work discusses the required elements of a robotic system that can continually and autonomously improve with data collected in the real world, proposes a particular instantiation of such a system, and demonstrates its efficacy on dexterous robotic manipulation tasks in simulation and the real world.

Concrete Problems in AI Safety

A list of five practical research problems related to accident risk is presented, categorized according to whether the problem originates from having the wrong objective function, an objective function that is too expensive to evaluate frequently, or undesirable behavior during the learning process.

Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control

This work proposes an efficient evolutionary learning algorithm to find the Pareto set approximation for continuous robot control problems, by extending a state-of-the-art RL algorithm and presenting a novel prediction model to guide the learning process.

Diversity is All You Need: Learning Skills without a Reward Function

The proposed DIAYN ("Diversity is All You Need"), a method for learning useful skills without a reward function, learns skills by maximizing an information theoretic objective using a maximum entropy policy.

Penalizing Side Effects using Stepwise Relative Reachability

A new variant of the stepwise inaction baseline and a new deviation measure based on relative reachability of states are introduced that avoid the given undesirable incentives, while simpler baselines and the unreachability measure fail.