Explainability in reinforcement learning: perspective and position

Agneza Krajna, Mario Brčić, Tomislav Lipić, Juraj Doncevic
Artificial intelligence (AI) has been embedded into many aspects of people’s daily lives, and it has become normal for people to have AI make decisions for them, from helping users find their favorite items to purchase and recommending movies and friends on Facebook to life-essential decisions. Reinforcement learning (RL) models increase the space of solvable problems with respect to other machine learning paradigms. Some of the most interesting applications are in situations with…

Explainable Artificial Intelligence: An Updated Perspective

This research offers an update on the current state of explainable AI (XAI), identifies new frontiers of research, including the explainability of reinforcement learning and graph neural networks, and gives a detailed overview of the field.

Prismal view of ethics

Sometimes one needs to look back to find out where she is going. We shall have a hard look at ethics and try to extract insights in the form of abstract properties that might become tools. We



Feature-Based Interpretable Reinforcement Learning based on State-Transition Models

  • Omid Davoodi, M. Komeili
  • Computer Science
  • 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
  • 2021
The paper demonstrates the effectiveness of the proposed method, which offers local explanations of risk in reinforcement learning in terms of the features most influential in increasing or decreasing risk, or any other desirable objective function, in the locality of the agent.

Explainable Deep Reinforcement Learning for UAV Autonomous Navigation

An explainable deep reinforcement learning method is proposed for the multirotor obstacle avoidance and navigation problem; it yields useful explanations that increase the user's trust in the network and also improve the network's performance.

Explainable Reinforcement Learning: A Survey

It is found that a) the majority of XRL methods function by mimicking and simplifying a complex model instead of designing an inherently simple one, and b) XRL (and XAI) methods often neglect to consider the human side of the equation, not taking into account research from related fields like psychology or philosophy.

Distal Explanations for Explainable Reinforcement Learning Agents

This work introduces a distal explanation model that can analyse counterfactuals and opportunity chains using decision trees and causal models, investigates participants' understanding of the agent through task prediction and their subjective satisfaction with the explanations, and shows that the model performs better in task prediction.

Explainable Reinforcement Learning via Reward Decomposition

This work exploits an off-policy variant of Q-learning that provably converges to an optimal policy and the correct decomposed action values, and introduces the concept of minimum sufficient explanations for compactly explaining why one action is preferred over another in terms of the types.

Distilling Deep Reinforcement Learning Policies in Soft Decision Trees

This paper illustrates how Soft Decision Tree (SDT) distillation can be used to make policies that are learned through RL more interpretable and realizes preliminary steps towards interpreting the learned behavior of the policy.

Explainable Reinforcement Learning Through a Causal Lens

This paper presents an approach that learns a structural causal model during reinforcement learning and encodes causal relationships between variables of interest and shows that causal model explanations perform better on these measures compared to two other baseline explanation models.

Contrastive Explanations for Reinforcement Learning in terms of Expected Consequences

This study proposes a method that enables an RL agent to explain its behavior in terms of the expected consequences of state transitions and outcomes, and develops a procedure that enables the agent to obtain the consequences of a single action, as well as of its entire policy.

Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning

This paper proposes a novel framework for efficient multi-task reinforcement learning that trains agents to employ hierarchical policies that decide when to use a previously learned policy and when to learn a new skill.

Improving Robot Controller Transparency Through Autonomous Policy Explanation

  • Bradley Hayes, J. Shah
  • Computer Science
  • 2017 12th ACM/IEEE International Conference on Human-Robot Interaction (HRI)
  • 2017
This work presents a series of algorithms and an accompanying system that enable robots to autonomously synthesize policy descriptions and respond to both general and targeted queries by human collaborators, demonstrating applicability to a variety of robot controller types.