Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees

Guiliang Liu, Oliver Schulte, Wang Zhu, Qingcan Li
Deep Reinforcement Learning (DRL) has achieved impressive success in many applications. We introduce Linear Model U-Trees (LMUTs) to approximate neural network predictions. An LMUT is learned using a novel on-line algorithm that is well-suited for an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment. Empirical evaluation shows that an LMUT mimics a Q function substantially better than five baseline methods. The transparent…

EDGE: Explaining Deep Reinforcement Learning Policies

A novel self-explainable model is proposed that augments a Gaussian process with a customized kernel function and an interpretable predictor and can predict an agent’s final rewards from its game episodes and extract time step importance within episodes as strategy-level explanations for that agent.

A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges

This survey provides a comprehensive review of existing works on eXplainable RL (XRL) and introduces a new taxonomy where prior works are clearly categorized into model-explaining, reward-explaining, state-explaining, and task-explaining methods.

Zoom In on Agents: Interpreting Deep Reinforcement Learning Models

Focusing on the first and fourth layers, it is found that layer-1 neurons tend to be interpretable as polysemantic representations of one of several distinct objects in the game Breakthrough, identifying particular objects’ locations, their direction of motion, and–in some cases–their destruction.

Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning

A novel Minimum Description Length (MDL) objective based on the Information Bottleneck (IB) principle is derived and a Monte Carlo Regression Tree Search (MCRTS) algorithm that explores different splits to find the IB-optimal mimic tree is described.

Discovering symbolic policies with deep reinforcement learning

This work uses an autoregressive recurrent neural network to generate control policies represented by tractable mathematical expressions and proposes an “anchoring” algorithm that distills pre-trained neural network-based policies into fully symbolic policies, one action dimension at a time, to scale to environments with multidimensional action spaces.

Designing Interpretable Approximations to Deep Reinforcement Learning with Soft Decision Trees

This work seeks to provide a quantitative framework with metrics to systematically evaluate the outcome of conversion processes, and identify reduced models that not only preserve a desired performance level, but also succinctly explain the latent knowledge represented by a DNN.

A Survey on Interpretable Reinforcement Learning

This survey provides an overview of various approaches to achieve higher interpretability in reinforcement learning and argues that interpretable RL may embrace different facets: interpretable inputs, interpretable (transition/reward) models, and interpretable decision-making.

Policy Extraction via Online Q-Value Distillation

This thesis introduces Q-BSP Trees and an Ordered Sequential Monte Carlo training algorithm that helps condense the Q-function from fully trained Deep Q-Networks into the tree structure and convincingly beats performance benchmarks provided by earlier policy distillation methods.

Distilling Deep Reinforcement Learning Policies in Soft Decision Trees

This paper illustrates how Soft Decision Tree (SDT) distillation can be used to make policies that are learned through RL more interpretable and realizes preliminary steps towards interpreting the learned behavior of the policy.

CDT: Cascading Decision Trees for Explainable Reinforcement Learning

Cascading Decision Trees (CDTs) apply representation learning on the decision path to allow richer expressivity and show that in both situations, where CDTs are used as policy function approximators or as imitation learners to explain black-box policies, CDTs can achieve better performances with more succinct and explainable models than SDTs.
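Several of the entries above (LMUTs, SDTs, CDTs, Q-BSP Trees) share one core move: query a trained black-box policy for state-action pairs, then fit a small tree to mimic it. A minimal sketch of that shared idea, using scikit-learn and a toy stand-in policy (the function names and data here are illustrative, not from any of the cited papers):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Stand-in "black-box policy"; in practice this would be a trained DRL network.
def policy(states):
    return (states[:, 0] + states[:, 1] > 1.0).astype(int)

# Build a mimic dataset by querying the policy on visited states.
states = rng.uniform(0, 1, size=(2000, 4))
actions = policy(states)

# Distill into a shallow, human-readable tree and measure fidelity,
# i.e. agreement with the black-box policy rather than task reward.
tree = DecisionTreeClassifier(max_depth=3).fit(states, actions)
fidelity = tree.score(states, actions)
print(f"fidelity: {fidelity:.3f}")
```

The cited methods differ in what replaces the vanilla tree (linear leaf models, soft gates, cascades) and in how states are gathered (on-line active play vs. a fixed replay buffer), but fidelity to the teacher is the common evaluation axis.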



Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

Tree Based Discretization for Continuous State Space Reinforcement Learning

This paper extends the U Tree algorithm to challenging domains with a continuous state space for which there is no initial discretization and transfers traditional regression tree techniques to reinforcement learning.

Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method

NFQ, an algorithm for efficient and effective training of a Q-value function represented by a multi-layer perceptron, is introduced and it is shown empirically, that reasonably few interactions with the plant are needed to generate control policies of high quality.

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
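The mechanism behind LIME can be sketched in a few lines: perturb the instance, weight the perturbations by proximity, and fit a weighted linear surrogate whose coefficients serve as the local explanation. This is a simplified sketch of the idea, not the official `lime` package API; the black-box function and kernel width below are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

# Black-box model stand-in: nonlinear in x0 and x1, ignores x2.
def predict_proba(X):
    return 1.0 / (1.0 + np.exp(-(3 * X[:, 0] - 2 * X[:, 1] ** 2)))

x0 = np.array([0.5, 0.5, 0.5])                      # instance to explain
samples = x0 + rng.normal(0, 0.3, size=(500, 3))    # local perturbations
weights = np.exp(-np.sum((samples - x0) ** 2, axis=1) / 0.25)  # proximity kernel

# Weighted sparse-ish linear surrogate; its coefficients are the explanation.
surrogate = Ridge(alpha=1.0).fit(samples - x0, predict_proba(samples),
                                 sample_weight=weights)
print("local feature weights:", surrogate.coef_)
```

Near this instance the surrogate recovers the local gradient signs: a positive weight on x0, a negative weight on x1, and a near-zero weight on the irrelevant x2.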

Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks

U-Tree, a reinforcement learning algorithm that uses selective attention and short-term memory to simultaneously address the intertwined problems of large perceptual state spaces and hidden state, learns quickly, creates only task-relevant state distinctions, and handles noise well.

Accurate and interpretable regression trees using oracle coaching

The experiments show that the oracle coaching leads to significantly improved predictive performance, compared to standard induction, and it is also shown that a highly accurate opaque model can be successfully used as a pre-processing step to reduce the noise typically present in data, even in situations where production inputs are not available.

Beyond Sparsity: Tree Regularization of Deep Models for Interpretability

This work explicitly regularizes deep models so human users might step through the process behind their predictions in little time, and trains deep time-series models so their class-probability predictions have high accuracy while being closely modeled by decision trees with few nodes.

Logistic Model Trees

This paper uses a stagewise fitting process to construct logistic regression models that can select relevant attributes in the data in a natural way, and shows how this approach can be used to build the logistic regression models at the leaves by incrementally refining those constructed at higher levels in the tree.

Interpretable Deep Models for ICU Outcome Prediction

This paper introduces a simple yet powerful knowledge-distillation approach called interpretable mimic learning, which uses gradient boosting trees to learn interpretable models and at the same time achieves strong prediction performance as deep learning models.
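The mimic-learning recipe described here is concrete enough to sketch: train the interpretable student on the teacher's *predictions* rather than the ground-truth labels, so the student absorbs the soft knowledge of the deep model. A minimal sketch with scikit-learn, using a small MLP as a stand-in teacher (the data and model sizes are illustrative, not from the paper):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

# "Deep" teacher (a small MLP here for illustration).
teacher = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=500,
                       random_state=0).fit(X, y)

# Mimic student: gradient boosting trees fit to the teacher's outputs,
# not the original labels -- the core of interpretable mimic learning.
soft_targets = teacher.predict(X)
student = GradientBoostingRegressor(n_estimators=100, max_depth=2,
                                    random_state=0).fit(X, soft_targets)
fidelity_r2 = student.score(X, soft_targets)
print(f"student-teacher R^2: {fidelity_r2:.3f}")
```

The student's individual shallow trees (or their aggregated feature importances) can then be inspected, which is where the interpretability claim comes from.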

Do Deep Nets Really Need to be Deep?

This paper empirically demonstrate that shallow feed-forward nets can learn the complex functions previously learned by deep nets and achieve accuracies previously only achievable with deep models.