Causal versus Marginal Shapley Values for Robotic Lever Manipulation Controlled using Deep Reinforcement Learning

Sindre Benjamin Remman, Inga Strümke, Anastasios M. Lekkas. 2022 American Control Conference (ACC).
We investigate the effect of including application knowledge about the causal relations among a robotic system's states when generating explanations of deep neural network policies. To this end, we compare two methods from explainable artificial intelligence, KernelSHAP and causal SHAP, on a deep neural network trained using deep reinforcement learning on the task of controlling a lever using a robotic manipulator. A primary disadvantage of KernelSHAP is that its explanations represent only the features…
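To make the marginal-versus-causal distinction concrete, here is a minimal sketch in plain Python (a toy setup, not the paper's robotic system): a chain causal graph X1 → X2 where X2 is a deterministic copy of X1, a model f(x1, x2) = x1 + x2, and exact two-player Shapley values. Marginal (KernelSHAP-style) values ignore the dependence and split credit symmetrically; causal values, which evaluate coalitions via do-interventions on the graph, give extra credit to the root cause.

```python
# Toy setup (hypothetical, for illustration): chain causal graph X1 -> X2,
# X2 a deterministic copy of X1, X1 uniform on {0, 1}.
# Model: f(x1, x2) = x1 + x2.  Explained instance: x = (1, 1).

def f(x1, x2):
    return x1 + x2

def v_marginal(S, x):
    # KernelSHAP-style value: missing features drawn from their marginals,
    # ignoring the dependence X2 = X1.
    total = 0.0
    for b1 in (0, 1):          # marginal samples for X1
        for b2 in (0, 1):      # marginal samples for X2 (independent!)
            x1 = x[0] if 1 in S else b1
            x2 = x[1] if 2 in S else b2
            total += f(x1, x2)
    return total / 4

def v_causal(S, x):
    # Causal value: E[f(X) | do(X_S = x_S)] under the graph X1 -> X2.
    total = 0.0
    for b1 in (0, 1):
        x1 = x[0] if 1 in S else b1
        # X2 follows its cause X1 unless X2 itself is intervened on.
        x2 = x[1] if 2 in S else x1
        total += f(x1, x2)
    return total / 2

def shapley(v, x):
    # Exact Shapley values for the two-feature case.
    phi1 = 0.5 * ((v({1}, x) - v(set(), x)) + (v({1, 2}, x) - v({2}, x)))
    phi2 = 0.5 * ((v({2}, x) - v(set(), x)) + (v({1, 2}, x) - v({1}, x)))
    return phi1, phi2

x = (1, 1)
print(shapley(v_marginal, x))  # (0.5, 0.5): symmetric split
print(shapley(v_causal, x))    # (0.75, 0.25): the cause X1 gets more credit
```

Both attributions sum to f(x) − E[f] = 1, but only the causal values reflect that X1 drives X2.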

Explainability in Deep Reinforcement Learning, a Review into Current Methods and Applications

The use of deep reinforcement learning schemes has increased dramatically since their introduction in 2015. This review surveys which methods are being used and in which applications, to identify which models are best suited to each application and whether any methods are being underutilised.

Robotic Lever Manipulation using Hindsight Experience Replay and Shapley Additive Explanations

This paper trains a policy using the Deep Deterministic Policy Gradient algorithm and the Hindsight Experience Replay technique to control a robotic manipulator in a lever-manipulation task, and uses the SHAP method to create an explanation model based on the episodes performed in the real-world environment.

Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models

A novel framework for computing Shapley values is proposed that generalizes recent work aiming to circumvent the independence assumption, and it is shown how these 'causal' Shapley values can be derived for general causal graphs without sacrificing any of their desirable properties.

Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates

It is demonstrated that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.

Explaining a Deep Reinforcement Learning Docking Agent Using Linear Model Trees with User Adapted Visualization

Deep neural networks (DNNs) can be useful within the marine robotics field, but their utility value is restricted by their black-box nature. Explainable artificial intelligence methods attempt to…

Hindsight Experience Replay

A novel technique is presented which allows sample-efficient learning from rewards that are sparse and binary, thereby avoiding the need for complicated reward engineering; it may be seen as a form of implicit curriculum.
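The core of the technique is goal relabeling. A minimal sketch (with hypothetical names and transition shapes, following the "final" strategy from the HER paper): every transition from a failed episode is stored a second time with the goal replaced by the goal actually achieved at the end of the episode, so the sparse binary reward becomes informative.

```python
# A minimal sketch of HER's "final" relabeling strategy.

def sparse_reward(achieved_goal, goal):
    # Binary reward: 0 on success, -1 otherwise (as in the HER paper's tasks).
    return 0.0 if achieved_goal == goal else -1.0

def relabel_episode(episode, goal):
    """episode: list of (state, action, next_state, achieved_goal) tuples.
    Returns replay transitions (state, goal, action, reward, next_state)."""
    buffer = []
    hindsight_goal = episode[-1][3]  # goal achieved at the end of the episode
    for state, action, next_state, achieved in episode:
        # Original transition with the intended goal (reward likely -1).
        buffer.append((state, goal, action,
                       sparse_reward(achieved, goal), next_state))
        # Hindsight transition: pretend we intended the achieved outcome.
        buffer.append((state, hindsight_goal, action,
                       sparse_reward(achieved, hindsight_goal), next_state))
    return buffer

# Toy episode that never reaches goal 5 but ends at 3.
episode = [(0, "a", 1, 1), (1, "a", 2, 2), (2, "a", 3, 3)]
buffer = relabel_episode(episode, goal=5)
print([r for (_, g, _, r, _) in buffer if g == 3])  # [-1.0, -1.0, 0.0]
```

The relabeled copy of the final transition earns reward 0, giving the agent a success signal even though the original goal was never reached.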

QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation

QT-Opt is introduced, a scalable self-supervised vision-based reinforcement learning framework that can leverage over 580k real-world grasp attempts to train a deep neural network Q-function with over 1.2M parameters to perform closed-loop, real-world grasping that generalizes to 96% grasp success on unseen objects.

End-to-End Training of Deep Visuomotor Policies

This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.

Continuous control with deep reinforcement learning

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability

Asymmetric Shapley values can improve model explanations by incorporating causal information, provide an unambiguous test for unfair discrimination in model predictions, enable sequentially incremental explanations in time-series models, and support feature-selection studies without the need for model retraining.
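A minimal sketch of the asymmetric idea (a toy setup, not the paper's own examples): rather than averaging marginal contributions over all feature orderings, only orderings consistent with the causal graph (causes before effects) are kept. With a chain X1 → X2 where X2 copies X1, the single admissible ordering puts all credit on the root cause.

```python
# Toy chain X1 -> X2 with X2 = X1, X1 uniform on {0, 1};
# model f(x1, x2) = x1 + x2; conditional-expectation value function.

def v_conditional(S, x):
    # E[f(X) | X_S = x_S] under the joint distribution where X2 = X1:
    # knowing either feature pins down both, so any nonempty S is enough.
    if S:
        return float(x[0] + x[1])
    return 1.0  # E[f] = E[2 * X1] = 1

def asymmetric_shapley(x):
    # Only the causally consistent ordering (X1 first, then X2) contributes.
    phi1 = v_conditional({1}, x) - v_conditional(set(), x)
    phi2 = v_conditional({1, 2}, x) - v_conditional({1}, x)
    return phi1, phi2

print(asymmetric_shapley((1, 1)))  # (1.0, 0.0): all credit to the root cause
```

Contrast with the symmetric (all-orderings) average, which would split the credit equally between the perfectly correlated features.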