Corpus ID: 7292590

Graying the black box: Understanding DQNs

@article{Zahavy2016GrayingTB,
  title={Graying the black box: Understanding DQNs},
  author={Tom Zahavy and Nir Ben-Zrihem and Shie Mannor},
  journal={ArXiv},
  year={2016},
  volume={abs/1602.02658}
}
In recent years there has been growing interest in using deep representations for reinforcement learning. In this paper, we present a methodology and tools to analyze Deep Q-networks (DQNs) in a non-blind manner. Using our tools we reveal that the features learned by DQNs aggregate the state space in a hierarchical fashion, explaining their success. Moreover, we are able to understand and describe the policies learned by DQNs for three different Atari2600 games and suggest ways to interpret, debug and… 
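The abstract describes visualizing the DQN's learned representation to see how it aggregates states. The paper uses t-SNE on last-hidden-layer activations; the sketch below substitutes a plain PCA projection to stay dependency-light, and all names and data are illustrative, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical (n_states, d) matrix of last-hidden-layer DQN activations,
# collected while the trained agent plays.
activations = rng.normal(size=(200, 32))

# Center and project to 2-D (the paper uses t-SNE; PCA via SVD stands in
# here so the sketch needs only NumPy). Clusters in such a map correspond
# to aggregated regions of the state space.
centered = activations - activations.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
embedding = centered @ vt[:2].T  # shape (200, 2)
```

In practice one would color each 2-D point by the predicted Q-value or chosen action to read off the hierarchy the abstract refers to.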
Towards Better Interpretability in Deep Q-Networks
TLDR
An interpretable neural network architecture for Q-learning is proposed which provides a global explanation of the model's behavior using key-value memories, attention and reconstructible embeddings, and can reach training rewards comparable to state-of-the-art deep Q-learning models.
Visualizing Dynamics: from t-SNE to SEMI-MDPs
TLDR
A novel method is presented that automatically discovers an internal Semi Markov Decision Process (SMDP) model in the Deep Q Network's (DQN) learned representation, along with a novel visualization method that represents the SMDP model by a directed graph and visualizes it above a t-SNE map.
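The construction this entry describes reduces, at its core, to clustering points of the activation map into abstract states and counting transitions between consecutive points. A minimal sketch, assuming cluster labels have already been obtained (the labels here are made up for illustration):

```python
from collections import Counter

# Hypothetical cluster labels of consecutive agent states, e.g. from
# clustering a t-SNE map of DQN activations.
labels = [0, 0, 1, 1, 2, 0, 1, 2, 2, 0]
k = 3

# Count transitions between consecutive abstract states.
counts = Counter(zip(labels[:-1], labels[1:]))

# Row-normalize the counts into SMDP-like transition probabilities.
transitions = [
    [counts[(i, j)] / sum(counts[(i, m)] for m in range(k)) for j in range(k)]
    for i in range(k)
]
```

The resulting matrix is the edge-weight data for the directed graph drawn over the t-SNE map; the real method also records transition durations, which is what makes the model semi-Markov.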
Deep Reinforcement Learning Discovers Internal Models
TLDR
This work presents the Semi-Aggregated MDP (SAMDP) model, a model best suited to describe policies exhibiting both spatial and temporal hierarchies, and describes its advantages for analyzing trained policies over other modeling approaches, and shows that under the right state representation, like that of DQN agents, SAMDP can help to identify skills.
Classifying Options for Deep Reinforcement Learning
TLDR
It is empirically shown that the augmented DQN has lower sample complexity when simultaneously learning subtasks with negative transfer, without degrading performance when learning subtasks with positive transfer.
Embedding High-Level Knowledge into DQNs to Learn Faster and More Safely
TLDR
This paper proposes a framework of Rule-interposing Learning (RIL) that embeds high-level knowledge into deep reinforcement learning, dynamically affecting the training progress and accelerating learning.
Finding and Visualizing Weaknesses of Deep Reinforcement Learning Agents
TLDR
This work presents a method for synthesizing visual inputs of interest for a trained agent by learning a generative model over the state space of the environment and using its latent space to optimize a target function for the state of interest.
Spatio-Temporal Abstractions in Reinforcement Learning Through Neural Encoding
TLDR
The purpose of the SAMDP modeling is to analyze trained policies by identifying temporal and spatial abstractions and it is shown that working with the right state representation mitigates the problem of finding spatial and temporal abstractions.
Learn to Interpret Atari Agents
Deep Reinforcement Learning (DeepRL) agents surpass human-level performance in a multitude of tasks. However, the direct mapping from states to actions makes it hard to interpret the rationale…
Visual Diagnostics for Deep Reinforcement Learning Policy Development
TLDR
These extensions of CNN visualization algorithms to the domain of vision-based reinforcement learning provide insight into the qualities and flaws of trained policies when interacting with the physical world.

References

SHOWING 1-10 OF 67 REFERENCES
Visualizing Dynamics: from t-SNE to SEMI-MDPs
TLDR
A novel method is presented that automatically discovers an internal Semi Markov Decision Process (SMDP) model in the Deep Q Network's (DQN) learned representation, along with a novel visualization method that represents the SMDP model by a directed graph and visualizes it above a t-SNE map.
Dueling Network Architectures for Deep Reinforcement Learning
TLDR
This paper presents a new neural network architecture for model-free reinforcement learning that leads to better policy evaluation in the presence of many similar-valued actions and enables the RL agent to outperform the state-of-the-art on the Atari 2600 domain.
Deep Reinforcement Learning with Double Q-Learning
TLDR
This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.
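The "specific adaptation" this entry summarizes is the Double DQN backup: the online network selects the greedy next action while the target network evaluates it, which curbs the overestimation bias of standard Q-learning. A minimal sketch (function and argument names are illustrative):

```python
def double_dqn_target(q_online_next, q_target_next, reward, done, gamma=0.99):
    """Compute the Double DQN target for one transition.

    q_online_next / q_target_next: Q-values at the next state under the
    online and target networks respectively (plain lists here).
    """
    # Online network picks the greedy action at the next state...
    a_star = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...but the target network evaluates it, decoupling selection
    # from evaluation.
    return reward + (0.0 if done else gamma * q_target_next[a_star])
```

For example, with `q_online_next=[1.0, 2.0]`, `q_target_next=[0.5, 1.5]`, `reward=1.0`, `done=False`, the online network selects action 1 and the target is `1.0 + 0.99 * 1.5 = 2.485`; a standard DQN would instead have used `max(q_target_next)`.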
Playing Atari with Deep Reinforcement Learning
TLDR
This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
TLDR
This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains, and uses Atari games as a testing environment to demonstrate these methods.
Massively Parallel Methods for Deep Reinforcement Learning
TLDR
This work presents the first massively distributed architecture for deep reinforcement learning, using a distributed neural network to represent the value function or behaviour policy, and a distributed store of experience to implement the Deep Q-Network algorithm.
Spatio-Temporal Abstractions in Reinforcement Learning Through Neural Encoding
TLDR
The purpose of the SAMDP modeling is to describe and allow a better understanding of complex behaviors by identifying temporal and spatial abstractions and it is shown that working with the right state representation mitigates the problem of finding spatial and temporal abstractions.
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
TLDR
h-DQN is presented, a framework to integrate hierarchical value functions, operating at different temporal scales, with intrinsically motivated deep reinforcement learning, and allows for flexible goal specifications, such as functions over entities and relations.
A Deep Hierarchical Approach to Lifelong Learning in Minecraft
We propose a lifelong learning system that has the ability to reuse and transfer knowledge from one task to another while efficiently retaining the previously learned knowledge-base. Knowledge is…