Interpretable Option Discovery Using Deep Q-Learning and Variational Autoencoders
@article{Andersen2022InterpretableOD, title={Interpretable Option Discovery Using Deep Q-Learning and Variational Autoencoders}, author={Per-Arne Andersen and Morten Goodwin and Ole-Christoffer Granmo}, journal={ArXiv}, year={2022}, volume={abs/2210.01231} }
Deep Reinforcement Learning (RL) is unquestionably a robust framework to train autonomous agents in a wide variety of disciplines. However, traditional deep and shallow model-free RL algorithms suffer from low sample efficiency and inadequate generalization for sparse state spaces. The options framework with temporal abstractions [18] is perhaps the most promising method to solve these problems, but it still has noticeable shortcomings. It only guarantees local convergence, and it is…
References
Deep Reinforcement Learning with Double Q-Learning
- Computer Science, AAAI
- 2016
This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.
SVQN: Sequential Variational Soft Q-Learning Networks
- Computer Science, ICLR
- 2020
A novel algorithm is proposed, named sequential variational soft Q-learning networks (SVQNs), which formalizes the inference of hidden states and maximum entropy reinforcement learning (MERL) under a unified graphical model and optimizes the two modules jointly.
Towards Better Interpretability in Deep Q-Networks
- Computer Science, AAAI
- 2019
An interpretable neural network architecture for Q-learning is proposed which provides a global explanation of the model's behavior using key-value memories, attention and reconstructible embeddings and can reach training rewards comparable to the state-of-the-art deep Q-learning models.
Deep Reinforcement Learning: A Brief Survey
- Computer Science, IEEE Signal Processing Magazine
- 2017
This survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via RL.
A Brief Survey of Deep Reinforcement Learning
- Computer Science, ArXiv
- 2017
This survey will cover central algorithms in deep reinforcement learning, including the deep Q-network, trust region policy optimisation, and asynchronous advantage actor-critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via reinforcement learning.
DQNViz: A Visual Analytics Approach to Understand Deep Q-Networks
- Computer Science, IEEE Transactions on Visualization and Computer Graphics
- 2019
This work proposes DQNViz, a visual analytics system to expose details of the blind training process in four levels, and enable users to dive into the large experience space of the DQN agent for comprehensive analysis, and demonstrates that it can effectively help domain experts to understand, diagnose, and potentially improve DQN models.
Recent Advances in Hierarchical Reinforcement Learning
- Computer Science, Discret. Event Dyn. Syst.
- 2003
This work reviews several approaches to temporal abstraction and hierarchical organization that machine learning researchers have recently developed and discusses extensions of these ideas to concurrent activities, multiagent coordination, and hierarchical memory for addressing partial observability.
Human-level control through deep reinforcement learning
- Computer Science, Nature
- 2015
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
End-to-End Training of Deep Visuomotor Policies
- Computer Science, J. Mach. Learn. Res.
- 2016
This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
Rainbow: Combining Improvements in Deep Reinforcement Learning
- Computer Science, AAAI
- 2018
This paper examines six extensions to the DQN algorithm and empirically studies their combination, showing that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance.