Corpus ID: 4669377

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

@inproceedings{Kulkarni2016HierarchicalDR,
  title={Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation},
  author={Tejas D. Kulkarni and Karthik Narasimhan and Ardavan Saeedi and Joshua B. Tenenbaum},
  booktitle={NIPS},
  year={2016}
}
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. [...] Key Method: A top-level value function learns a policy over intrinsic goals, and a lower-level function learns a policy over atomic actions to satisfy the given goals. h-DQN allows for flexible goal specifications, such as functions over entities and relations. This provides an efficient space for exploration in complicated environments. We demonstrate the strength of our …
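The two-level decomposition described in the abstract is easy to make concrete. Below is a minimal Python sketch of the h-DQN control loop under simplifying assumptions: both value functions are tabular epsilon-greedy learners rather than the paper's DQNs, the env interface (reset, step, goal_reached) is a hypothetical stand-in, and updates are one-step without bootstrapping.

```python
import random

class EpsGreedyQ:
    """Hypothetical stand-in for an h-DQN value function: tabular,
    epsilon-greedy, with a simple one-step update (no bootstrapping)."""
    def __init__(self, choices, eps=0.1, lr=0.1):
        self.q, self.choices, self.eps, self.lr = {}, choices, eps, lr

    def act(self, state):
        if random.random() < self.eps:
            return random.choice(self.choices)
        return max(self.choices, key=lambda c: self.q.get((state, c), 0.0))

    def update(self, state, choice, target):
        key = (state, choice)
        self.q[key] = self.q.get(key, 0.0) + self.lr * (target - self.q.get(key, 0.0))

def h_dqn_episode(env, meta, ctrl):
    """One episode of the two-level loop: the meta-controller picks an
    intrinsic goal, then the controller acts until that goal is satisfied."""
    s, done = env.reset(), False
    while not done:
        s0, g = s, meta.act(s)                  # top level: choose a goal
        extrinsic_return, reached = 0.0, False
        while not (done or reached):
            a = ctrl.act((s, g))                # low level: action given (state, goal)
            s_next, r_ext, done = env.step(a)
            reached = env.goal_reached(s_next, g)            # internal critic
            ctrl.update((s, g), a, 1.0 if reached else 0.0)  # intrinsic reward
            extrinsic_return += r_ext
            s = s_next
        meta.update(s0, g, extrinsic_return)    # top level learns from env reward
```

In the paper both levels are deep Q-networks with their own replay memories; the tabular learners above exist only to keep the control flow visible.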
Deep Reinforcement Learning with Temporal Abstraction and Intrinsic Motivation
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. The primary difficulty arises due to insufficient exploration, …
Language as an Abstraction for Hierarchical Deep Reinforcement Learning
TLDR: This paper introduces an open-source object interaction environment built using the MuJoCo physics engine and the CLEVR engine, and finds that, using the approach, agents can learn to solve diverse, temporally extended tasks such as object sorting and multi-object rearrangement, including from raw pixel observations.
Learning Representations in Model-Free Hierarchical Reinforcement Learning
TLDR: This paper offers an original approach to HRL that does not require the acquisition of a model of the environment, is suitable for large-scale applications, and demonstrates the efficiency of the method on two RL problems with sparse delayed feedback.
Unsupervised Methods For Subgoal Discovery During Intrinsic Motivation in Model-Free Hierarchical Reinforcement Learning
TLDR: This paper offers an original approach to HRL that does not require the acquisition of a model of the environment, is suitable for large-scale applications, and demonstrates the efficiency of the method on two RL problems with sparse delayed feedback.
Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction
Multiagent reinforcement learning (MARL) is commonly considered to suffer from non-stationary environments and an exponentially increasing policy space. It would be even more challenging when rewards …
Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning
TLDR: This work proposes the Hypothesis Proposal and Evaluation (HyPE) algorithm, which discovers objects from raw pixel data, generates hypotheses about the controllability of observed changes in object state, and learns a hierarchy of skills to test these hypotheses.
Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning
TLDR: A DRL algorithm that aims to improve data efficiency via both the utilization of unrewarded experiences and the exploration strategy, by combining ideas from unsupervised auxiliary tasks, intrinsic motivation, and hierarchical reinforcement learning (HRL).
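The intrinsic signal behind feature control can be stated in one line: pay the agent for changing a learned feature it is able to influence. A minimal sketch, where phi (a feature extractor mapping observations to a vector) and the index k are hypothetical names rather than that paper's API:

```python
def feature_control_reward(obs, next_obs, phi, k, eta=0.05):
    """Intrinsic reward for exercising control over the k-th feature:
    the larger the change the agent induces in phi(obs)[k], the larger
    the bonus, independently of any extrinsic reward."""
    return eta * abs(phi(next_obs)[k] - phi(obs)[k])
```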
Scheduled Intrinsic Drive: A Hierarchical Take on Intrinsically Motivated Exploration
TLDR: A new type of intrinsic reward denoted as successor feature control (SFC) is introduced, which takes into account statistics over complete trajectories and thus differs from previous methods that only use local information to evaluate intrinsic motivation.
A Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes
TLDR: This paper proposes a deep hierarchical reinforcement learning algorithm for hierarchical POMDPs, in which tasks have only partial observability and possess hierarchical properties.
Learning Intrinsic Symbolic Rewards in Reinforcement Learning
TLDR: This paper presents a method that discovers dense rewards in the form of low-dimensional symbolic trees, thus making them more tractable for analysis, and shows that the discovered dense rewards are an effective signal for an RL policy to solve the benchmark tasks.

References

Showing 1-10 of 62 references
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models
TLDR: This paper considers the challenging Atari games domain and proposes a new exploration method based on assigning exploration bonuses from a concurrently learned model of the system dynamics, which provides the most consistent improvement across a range of games that pose a major challenge for prior methods.
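The bonus mechanism in that paper is compact enough to sketch: augment the environment reward with the prediction error of a concurrently learned dynamics model, so poorly modelled (novel) states attract the agent. The model.predict interface below is an assumed name, not the paper's code:

```python
import numpy as np

def augmented_reward(r_ext, s, a, s_next, model, beta=0.05):
    """Exploration bonus from a learned dynamics model: states the model
    predicts poorly look novel, so visiting them earns extra reward."""
    prediction_error = np.linalg.norm(np.asarray(s_next) - model.predict(s, a))
    return r_ext + beta * prediction_error
```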
Active learning of inverse models with intrinsically motivated goal exploration in robots
We introduce the Self-Adaptive Goal Generation Robust Intelligent Adaptive Curiosity (SAGG-RIAC) architecture as an intrinsically motivated goal exploration mechanism which allows active learning of …
Human-level control through deep reinforcement learning
TLDR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent capable of learning to excel at a diverse array of challenging tasks.
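For orientation, the regression target at the heart of that agent (DQN) fits in a few lines. In this sketch q_target is a callable returning a vector of action values from the periodically frozen target network; names and shapes are assumptions, not the paper's implementation:

```python
import numpy as np

def dqn_targets(rewards, next_states, dones, q_target, gamma=0.99):
    """One-step TD targets r + gamma * max_a' Q_target(s', a'),
    with bootstrapping cut off at terminal states (done == 1)."""
    next_q = np.array([q_target(s).max() for s in next_states])
    return np.asarray(rewards) + gamma * (1.0 - np.asarray(dones)) * next_q
```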
Recent Advances in Hierarchical Reinforcement Learning
TLDR: This work reviews several approaches to temporal abstraction and hierarchical organization that machine learning researchers have recently developed, and discusses extensions of these ideas to concurrent activities, multiagent coordination, and hierarchical memory for addressing partial observability.
Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction
TLDR: Results using Horde on a multi-sensored mobile robot to successfully learn goal-oriented behaviors and long-term predictions from off-policy experience are presented.
Deep Recurrent Q-Learning for Partially Observable MDPs
TLDR: The effects of adding recurrency to a Deep Q-Network are investigated by replacing the first post-convolutional fully connected layer with a recurrent LSTM, which successfully integrates information through time and replicates DQN's performance on standard Atari games and partially observed equivalents featuring flickering game screens.
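The architectural change is small enough to show directly: keep DQN's convolutional stack but route its flattened features through an LSTM in place of the first fully connected layer. A PyTorch-style sketch; layer sizes follow the common 84x84 Atari setup and are illustrative, not the paper's exact configuration:

```python
import torch.nn as nn

class DRQN(nn.Module):
    """DQN with the first post-convolutional fully connected layer
    replaced by an LSTM, so value estimates can integrate over time."""
    def __init__(self, n_actions, hidden=512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 8, stride=4), nn.ReLU(),   # one frame per step,
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),  # no frame stacking
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
        )
        self.lstm = nn.LSTM(64 * 7 * 7, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, frames, state=None):
        # frames: (batch, time, 1, 84, 84)
        b, t = frames.shape[:2]
        z = self.conv(frames.reshape(b * t, *frames.shape[2:]))
        z, state = self.lstm(z.reshape(b, t, -1), state)
        return self.head(z), state  # Q-values per time step + recurrent state
```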
Learning Purposeful Behaviour in the Absence of Rewards
TLDR: This paper presents an algorithm capable of learning purposeful behaviour in the absence of rewards; it is particularly suited to settings where rewards are very sparse, and such behaviours can aid exploration of the environment until a reward is observed.
Subgoal Discovery for Hierarchical Reinforcement Learning Using Learned Policies
TLDR: This paper presents a method by which a reinforcement learning agent can discover subgoals with certain structural properties and include policies to those subgoals as actions in its action set, so that the agent can explore more effectively and accelerate learning in other tasks in the same or similar environments where the same subgoals are useful.
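The mechanism (treating a learned policy-to-subgoal as one temporally extended action) can be sketched as below; option_policies and goal_test are hypothetical stand-ins for the agent's discovered subgoals and their learned policies:

```python
def execute(env, s, choice, option_policies, goal_test, done=False):
    """Run either a primitive action or an option: an option follows its
    subgoal policy until the subgoal is reached (or the episode ends) and
    returns the accumulated reward as if it were a single action."""
    if choice not in option_policies:              # primitive action
        return env.step(choice)
    total = 0.0
    while not (done or goal_test(s, choice)):      # option: run policy to subgoal
        s, r, done = env.step(option_policies[choice](s))
        total += r
    return s, total, done
```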
Asynchronous Methods for Deep Reinforcement Learning
TLDR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers, and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
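The update each asynchronous worker applies is a standard advantage actor-critic step. A compact sketch of the per-rollout loss, assuming log_probs, values, returns, and entropy are torch tensors gathered from one worker's rollout (the names are mine, not the paper's):

```python
def actor_critic_loss(log_probs, values, returns, entropy,
                      vf_coef=0.5, ent_coef=0.01):
    """Policy-gradient term weighted by the advantage, a value-regression
    term, and an entropy bonus that discourages premature convergence.
    All inputs are torch tensors of shape (rollout_length,)."""
    advantages = returns - values
    policy_loss = -(log_probs * advantages.detach()).mean()
    value_loss = advantages.pow(2).mean()
    return policy_loss + vf_coef * value_loss - ent_coef * entropy.mean()
```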
Hierarchical Memory-Based Reinforcement Learning
TLDR: This paper shows how a hierarchy of behaviors can be used to create and select among variable-length short-term memories appropriate for a task, and formalizes this idea in a framework called Hierarchical Suffix Memory (HSM).