Interpretable Option Discovery Using Deep Q-Learning and Variational Autoencoders

@article{Andersen2022InterpretableOD,
  title={Interpretable Option Discovery Using Deep Q-Learning and Variational Autoencoders},
  author={Per-Arne Andersen and Morten Goodwin and Ole-Christoffer Granmo},
  journal={ArXiv},
  year={2022},
  volume={abs/2210.01231}
}
Deep Reinforcement Learning (RL) is unquestionably a robust framework to train autonomous agents in a wide variety of disciplines. However, traditional deep and shallow model-free RL algorithms suffer from low sample efficiency and inadequate generalization for sparse state spaces. The options framework with temporal abstractions [18] is perhaps the most promising method to solve these problems, but it still has noticeable shortcomings. It only guarantees local convergence, and it is…
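The options framework cited above ([18], Sutton, Precup & Singh) defines an option as a triple of an initiation set, an intra-option policy, and a termination condition, executed in call-and-return fashion. Below is a minimal Python sketch of that execution loop, assuming a Gym-style `env.step` API; the `Option` class and all names are illustrative, not the paper's implementation.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, Set

@dataclass
class Option:
    """An option in the sense of the options framework [18]:
    an initiation set I, an intra-option policy pi, and a
    termination condition beta (probability of stopping in a state)."""
    initiation_set: Set[Any]             # states where the option may start
    policy: Callable[[Any], Any]         # intra-option policy: state -> primitive action
    termination: Callable[[Any], float]  # beta(state): probability of terminating

def run_option(env, state, option, gamma=0.99):
    """Execute one option in call-and-return fashion and return the
    discounted return, the state it terminated in, and the elapsed steps."""
    assert state in option.initiation_set, "option not available in this state"
    total_return, discount, steps = 0.0, 1.0, 0
    done = False
    while not done:
        action = option.policy(state)
        state, reward, done, _ = env.step(action)  # Gym-style step API (assumed)
        total_return += discount * reward
        discount *= gamma
        steps += 1
        if random.random() < option.termination(state):
            break
    return total_return, state, steps
```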

References

Showing 1-10 of 25 references

Deep Reinforcement Learning with Double Q-Learning

This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.
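The adaptation referred to here decouples action selection from action evaluation: the online network picks the greedy next action, and the target network scores it. A minimal sketch of that target computation, assuming PyTorch and two separate Q-networks; the function name and tensor shapes are illustrative.

```python
import torch

def double_dqn_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Double Q-learning target: the online network selects the greedy next
    action, the target network evaluates it, reducing the overestimation
    bias of the standard max-based DQN target."""
    with torch.no_grad():
        # Action selection by the online network.
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        # Action evaluation by the target network.
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        targets = rewards + gamma * (1.0 - dones.float()) * next_q
    return targets
```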

SVQN: Sequential Variational Soft Q-Learning Networks

A novel algorithm is proposed, named sequential variational soft Q-learning networks (SVQNs), which formalizes the inference of hidden states and maximum entropy reinforcement learning (MERL) under a unified graphical model and optimizes the two modules jointly.
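As a small illustration of the maximum entropy RL component mentioned in the summary, the sketch below computes the soft state value V(s) = α log Σ_a exp(Q(s,a)/α) and the Boltzmann policy it induces, in plain NumPy. This is a generic MERL building block only, not the SVQN model itself; the function names are illustrative.

```python
import numpy as np

def soft_value(q_values, alpha=1.0):
    """Soft state value used in maximum-entropy RL:
    V(s) = alpha * log sum_a exp(Q(s, a) / alpha),
    computed with the log-sum-exp trick for numerical stability."""
    scaled = np.asarray(q_values) / alpha
    m = scaled.max()
    return alpha * (m + np.log(np.exp(scaled - m).sum()))

def soft_policy(q_values, alpha=1.0):
    """Boltzmann policy induced by the soft value:
    pi(a|s) proportional to exp(Q(s, a) / alpha)."""
    scaled = np.asarray(q_values) / alpha
    scaled = scaled - scaled.max()
    probs = np.exp(scaled)
    return probs / probs.sum()

# Example: soft value and policy for a state with three actions.
print(soft_value([1.0, 2.0, 0.5], alpha=0.5))
print(soft_policy([1.0, 2.0, 0.5], alpha=0.5))
```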

Towards Better Interpretability in Deep Q-Networks

An interpretable neural network architecture for Q-learning is proposed which provides a global explanation of the model's behavior using key-value memories, attention and reconstructible embeddings, and can reach training rewards comparable to state-of-the-art deep Q-learning models.
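A key-value attention readout of the kind this summary mentions can be sketched in a few lines of PyTorch: the state embedding queries a learned memory, and the attention weights over memory slots are inspectable. The module name, dimensions, and layout below are illustrative assumptions, not the cited paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KeyValueReadout(nn.Module):
    """Attend over a learned key-value memory using the state embedding as
    the query; the attention weights give a per-slot indication of which
    stored entries drove the Q-value estimate."""
    def __init__(self, embed_dim=64, num_slots=32, num_actions=4):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_slots, embed_dim))
        self.values = nn.Parameter(torch.randn(num_slots, embed_dim))
        self.q_head = nn.Linear(embed_dim, num_actions)

    def forward(self, query):
        # query: (batch, embed_dim) state embedding
        attn = F.softmax(query @ self.keys.t(), dim=-1)  # (batch, num_slots)
        readout = attn @ self.values                     # (batch, embed_dim)
        q_values = self.q_head(readout)
        return q_values, attn                            # attn can be visualized
```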

Deep Reinforcement Learning: A Brief Survey

This survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor-critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via RL.

A Brief Survey of Deep Reinforcement Learning

This survey will cover central algorithms in deep reinforcement learning, including the deep Q-network, trust region policy optimisation, and asynchronous advantage actor-critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via reinforcement learning.

DQNViz: A Visual Analytics Approach to Understand Deep Q-Networks

This work proposes DQNViz, a visual analytics system that exposes details of the otherwise opaque training process at four levels and enables users to dive into the large experience space of the DQN agent for comprehensive analysis, and demonstrates that it can effectively help domain experts understand, diagnose, and potentially improve DQN models.

Recent Advances in Hierarchical Reinforcement Learning

This work reviews several approaches to temporal abstraction and hierarchical organization that machine learning researchers have recently developed and discusses extensions of these ideas to concurrent activities, multiagent coordination, and hierarchical memory for addressing partial observability.

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

End-to-End Training of Deep Visuomotor Policies

This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
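The end-to-end mapping described here, from raw pixels to motor torques, can be illustrated with a small convolutional policy network in PyTorch. The layer sizes, joint count, and class name below are illustrative assumptions, not the architecture or training procedure from the cited paper.

```python
import torch
import torch.nn as nn

class VisuomotorPolicy(nn.Module):
    """Convolutional policy mapping a raw RGB image observation directly
    to a vector of joint torques (end-to-end visuomotor control)."""
    def __init__(self, num_joints=7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, num_joints),  # one torque command per joint
        )

    def forward(self, image):
        # image: (batch, 3, H, W) raw pixels
        return self.head(self.encoder(image))

# Example: a single 64x64 observation produces 7 torque commands.
policy = VisuomotorPolicy()
torques = policy(torch.zeros(1, 3, 64, 64))
print(torques.shape)  # torch.Size([1, 7])
```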

Rainbow: Combining Improvements in Deep Reinforcement Learning

This paper examines six extensions to the DQN algorithm and empirically studies their combination, showing that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance.
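The six extensions Rainbow combines are double Q-learning, prioritized replay, dueling networks, multi-step learning, distributional RL, and noisy nets. As a small illustration of how two of them compose, the sketch below forms an n-step return bootstrapped with a double-Q estimate, assuming PyTorch tensors; it is not the full Rainbow agent, and the function name and argument layout are illustrative.

```python
import torch

def n_step_double_dqn_target(online_net, target_net, rewards, final_state,
                             final_done, gamma=0.99):
    """n-step double-DQN target, combining two of Rainbow's six ingredients:
    a discounted n-step reward sum, bootstrapped with a double-Q estimate
    of the state reached after n steps.

    rewards:     list of the n intermediate reward tensors r_t ... r_{t+n-1}
    final_state: batched observation s_{t+n}
    final_done:  1.0 where the episode ended within the n steps
    """
    n_step_return = sum((gamma ** k) * r for k, r in enumerate(rewards))
    with torch.no_grad():
        best_action = online_net(final_state).argmax(dim=1, keepdim=True)
        bootstrap = target_net(final_state).gather(1, best_action).squeeze(1)
    return n_step_return + (gamma ** len(rewards)) * (1.0 - final_done) * bootstrap
```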