Corpus ID: 211258767

How Transferable are the Representations Learned by Deep Q Agents?

Jacob Tyo and Zachary Chase Lipton

In this paper, we consider the source of Deep Reinforcement Learning (DRL)'s sample complexity, asking how much derives from the requirement of learning useful representations of environment states and how much is due to the sample complexity of learning a policy. While for DRL agents, the distinction between representation and policy may not be clear, we seek new insight through a set of transfer learning experiments. In each experiment, we retain some fraction of layers trained on either the… 


Component Transfer Learning for Deep RL Based on Abstract Representations

This work investigates a specific transfer learning approach for deep reinforcement learning in the context where the internal dynamics between two tasks are the same but the visual representations differ, and finds that the transfer performance is heavily reliant on the base model.

Investigating the Properties of Neural Network Representations in Reinforcement Learning

This analysis provides novel hypotheses regarding the impact of auxiliary tasks in end-to-end training of non-linear reinforcement learning methods, and develops a method to better understand why some representations work better for transfer.

On The Transferability of Deep-Q Networks

The results show that transferring neural networks in a DRL context can be particularly challenging and in most cases results in negative transfer; in an attempt to understand why Deep-Q Networks transfer so poorly, novel insights are gained into the training dynamics that characterize this family of algorithms.

Emergent Representations in Reinforcement Learning and Their Properties

This thesis empirically investigates how emergent representations learned with different task settings relate to historical notions of good representations, and provides novel insights regarding end-to-end training, the auxiliary task effect, and the utility of successor-feature targets.

State Action Separable Reinforcement Learning

Experiments show that sasRL outperforms state-of-the-art MDP-based RL algorithms by up to 75%; a lightweight transition model is learned to help the agent determine the action that triggers the associated state transition.



Dueling Network Architectures for Deep Reinforcement Learning

This paper presents a new neural network architecture for model-free reinforcement learning that leads to better policy evaluation in the presence of many similar-valued actions and enables the RL agent to outperform the state-of-the-art on the Atari 2600 domain.

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

A Distributional Perspective on Reinforcement Learning

This paper argues for the fundamental importance of the value distribution: the distribution of the random return received by a reinforcement learning agent, and designs a new algorithm which applies Bellman's equation to the learning of approximate value distributions.

Noisy Networks for Exploration

It is found that replacing the conventional exploration heuristics for A3C, DQN and dueling agents with NoisyNet yields substantially higher scores for a wide range of Atari games, in some cases advancing the agent from sub to super-human performance.

Rainbow: Combining Improvements in Deep Reinforcement Learning

This paper examines six extensions to the DQN algorithm and empirically studies their combination, showing that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance.

Sparse multi-task reinforcement learning

This paper develops two multi-task extensions of the fitted Q-iteration algorithm that assume the tasks are jointly sparse in the given representation, and learns a transformation of the features in an attempt to find a sparser representation.

The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract)

The promise of ALE is illustrated by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning, and an evaluation methodology made possible by ALE is proposed.

Prioritized Experience Replay

A framework for prioritizing experience, so as to replay important transitions more frequently, and therefore learn more efficiently, in Deep Q-Networks, a reinforcement learning algorithm that achieved human-level performance across many Atari games.

Effective Control Knowledge Transfer through Learning Skill and Representation Hierarchies

A learning architecture which transfers control knowledge in the form of behavioral skills and corresponding representation concepts from one task to subsequent learning tasks and can significantly outperform learning on a flat state space representation and the MAXQ method for hierarchical reinforcement learning.

Mastering the game of Go with deep neural networks and tree search

Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.