Reinforcement Learning for Classical Planning: Viewing Heuristics as Dense Reward Generators

Clement Gehring, Masataro Asai, Rohan Chitnis, Tom Silver, Leslie Pack Kaelbling, Shirin Sohrabi, Michael Katz
Recent advances in reinforcement learning (RL) have led to a growing interest in applying RL to classical planning domains or applying classical planning methods to some complex RL domains. However, the long-horizon goal-based problems found in classical planning lead to sparse rewards for RL, making direct application inefficient. In this paper, we propose to leverage domain-independent heuristic functions commonly used in the classical planning literature to improve the sample efficiency of… 
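The abstract above is truncated, so the paper's exact formulation is not visible here. As an illustration only, one standard way to turn a heuristic into a dense reward signal is potential-based reward shaping, with the potential set to the negated heuristic value. The sketch below assumes that technique and uses a toy one-dimensional domain; the names `shaped_reward`, `distance_to_goal`, `sparse_reward`, and `GOAL` are hypothetical and do not come from the paper.

```python
from typing import Callable

# Hypothetical toy domain: states are integer positions on a line, goal at 5.
GOAL = 5


def distance_to_goal(state: int) -> float:
    """Stand-in heuristic h(s): estimated number of steps to the goal."""
    return float(abs(GOAL - state))


def sparse_reward(next_state: int) -> float:
    """Goal-based sparse reward: 1 on reaching the goal, 0 otherwise."""
    return 1.0 if next_state == GOAL else 0.0


def shaped_reward(
    reward: float,
    state: int,
    next_state: int,
    heuristic: Callable[[int], float],
    gamma: float = 1.0,
) -> float:
    """Potential-based shaping with potential phi(s) = -h(s).

    Returns r + gamma * phi(s') - phi(s). This is an assumed construction,
    not necessarily the one used in the paper.
    """
    phi_s = -heuristic(state)
    phi_next = -heuristic(next_state)
    return reward + gamma * phi_next - phi_s


# A step toward the goal earns a positive signal even though the sparse
# reward is zero; a step away from the goal earns a negative one.
toward = shaped_reward(sparse_reward(3), 2, 3, distance_to_goal)
away = shaped_reward(sparse_reward(1), 2, 1, distance_to_goal)
```

With this choice of potential, every transition carries feedback proportional to the change in the heuristic estimate, which is the sense in which a heuristic can act as a dense reward generator.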


Scaling Directed Controller Synthesis via Reinforcement Learning

The proposed synthesis algorithm is framed as an RL task with an unbounded action space and solved with a modified version of DQN. Heuristics learned on small versions of a problem are shown to generalize to larger instances.

Planning Large-scale Object Rearrangement Using Deep Reinforcement Learning

The proposed deep-RL-based task planning method is the first to demonstrate rearrangement across different scenarios, from 2D surfaces such as tabletops to 3D rooms with a large number of objects, without any explicit need for buffer space.

Hybrid Deep RePReL: Integrating Relational Planning and Reinforcement Learning for Information Fusion

A novel neuro-symbolic system, Hybrid Deep RePReL, is introduced that achieves the best of both worlds: the generalization ability of the planner with the effective learning ability of deep RL.

Abstract Interpretation for Generalized Heuristic Search in Model-Based Planning

It is illustrated how abstract interpretation can serve as a unifying framework for these abstraction-based heuristics, extending the reach of heuristic search to richer world models that make use of more complex datatypes and functions.

Action Space Reduction for Planning Domains

An automated way of reducing the action spaces of reinforcement learning environments by leveraging lifted mutex groups is proposed, showing a significant reduction in the action space size of the RL environments.

CACLA-Based Local Path Planner for Drones Navigating Unknown Indoor Corridors

An online local path planning approach is presented for an autonomous drone navigating a 2D plane in an unknown, indoor, corridor-like environment. It incorporates a special experience replay memory to improve the drone's convergence toward the balancing point.

PG3: Policy-Guided Planning for Generalized Policy Generation

This work studies generalized policy search-based methods with a focus on the score function used to guide the search over policies. It demonstrates limitations of two score functions (policy evaluation and plan comparison) and proposes a new approach that overcomes these limitations.

Hierarchical Reinforcement Learning with AI Planning Models

This work proposes an integrative approach that combines high-level planning with RL, retaining interpretability, transfer, and efficiency, while allowing for robust learning of the lower-level plan actions.



Neural Logic Machines

The Neural Logic Machine is proposed, a neural-symbolic architecture for both inductive learning and logic reasoning that achieves perfect generalization in a number of tasks, from relational reasoning tasks on the family tree and general graphs, to decision making tasks including sorting arrays, finding shortest paths, and playing the blocks world.

Relational inductive biases, deep learning, and graph networks

It is argued that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective.

Generalized Planning With Deep Reinforcement Learning

This work studies the use of Deep Reinforcement Learning and Graph Neural Networks to learn generalized policies and demonstrates that they can generalize to instances that are orders of magnitude larger than those they were trained on.

Inductive Logic Programming

This book constitutes the thoroughly refereed post-proceedings of the 23rd International Conference on Inductive Logic Programming, ILP 2013, held in Rio de Janeiro, Brazil, in August 2013. The 9

Goal distance estimation for automated planning using neural networks and support vector machines

This paper analyzes the performance of standalone learned heuristics based on nature-inspired machine learning techniques and employs a comparison to the standard FF heuristic and other heuristic learning approaches.

Meta Reinforcement Learning for Heuristic Planning

This work makes a connection between meta-reinforcement learning and heuristic planning, showing that heuristic functions meta-learned from planning problems in a given domain can outperform both popular domain-independent heuristics and heuristics learned by supervised learning.

Factor Graph Neural Networks

Neural Network Heuristics for Classical Planning: A Study of Hyperparameter Space

A study of hyperparameter space in the most canonical setup (input = state, feed-forward NN, supervised learning, generalization only over initial state) that investigates a broad range of hyperparameters pertaining to NN design and training.