Integrating Symmetry into Differentiable Planning

Linfeng Zhao, Xu Zhu, Lingzhi Kong, Robin Walters, Lawson L. S. Wong
We study how group symmetry helps improve data efficiency and generalization for end-to-end differentiable planning algorithms, specifically on 2D robotic path planning problems: navigation and manipulation. We first formalize the idea from Value Iteration Networks (VINs) of using convolutional networks for path planning, because it avoids explicitly constructing equivalence classes and enables end-to-end planning. We then show that value iteration can always be represented as some convolutional…
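The connection between value iteration and a local, convolution-like update can be illustrated with a small sketch. It assumes a deterministic 4-connected grid world, a reward of 1 for stepping onto the goal, and a 90-degree rotation as the symmetry being checked. The function names (`value_iteration`, `rot90`, `rot90_point`) and all parameters are hypothetical choices for illustration and are not taken from the paper.

```python
# Illustrative sketch only: tabular value iteration on a square grid, written
# as "look at each shifted neighbour, add reward, take max over actions".
# Grid layout, rewards, and names are assumptions, not the paper's method.

ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def value_iteration(obstacles, goal, iters=50, gamma=0.9):
    """Return the value map after `iters` synchronous Bellman backups."""
    n = len(obstacles)
    V = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        new_V = [[0.0] * n for _ in range(n)]
        for r in range(n):
            for c in range(n):
                if obstacles[r][c] or (r, c) == goal:
                    continue  # obstacles and the terminal goal keep value 0
                best = float("-inf")
                for dr, dc in ACTIONS:
                    nr, nc = r + dr, c + dc
                    blocked = not (0 <= nr < n and 0 <= nc < n) or obstacles[nr][nc]
                    if blocked:
                        nr, nc = r, c  # a blocked move leaves the agent in place
                    reward = 1.0 if (nr, nc) == goal else 0.0
                    best = max(best, reward + gamma * V[nr][nc])
                new_V[r][c] = best
        V = new_V
    return V


def rot90(grid):
    """Rotate a square grid 90 degrees clockwise."""
    n = len(grid)
    return [[grid[n - 1 - c][r] for c in range(n)] for r in range(n)]


def rot90_point(point, n):
    """Rotate a (row, col) index 90 degrees clockwise in an n x n grid."""
    r, c = point
    return (c, n - 1 - r)


obstacles = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
goal = (0, 3)

V = value_iteration(obstacles, goal)
V_rot = value_iteration(rot90(obstacles), rot90_point(goal, 4))

# Equivariance check: planning on the rotated map should match rotating the plan.
expected = rot90(V)
equivariant = all(
    abs(V_rot[r][c] - expected[r][c]) < 1e-9 for r in range(4) for c in range(4)
)
print(equivariant)
```

Because the action set here is closed under 90-degree rotation and the backup only uses local neighbours, rotating the input map and goal should rotate the resulting value map in the same way. This is the kind of symmetry that an equivariant convolutional planner is designed to respect by construction rather than learn from data.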


An investigation of model-free planning
It is demonstrated empirically that an entirely model-free approach, without special structure beyond standard neural network components such as convolutional networks and LSTMs, can learn to exhibit many of the characteristics typically associated with a model-based planner.
Universal Planning Networks
This work finds that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images.
Differentiable Spatial Planning using Transformers
Spatial Planning Transformers (SPT) are proposed, which, given an obstacle map, learn to generate actions by planning over long-range spatial dependencies, unlike prior data-driven planners that propagate information locally via convolutional structure in an iterative manner.
Neural Algorithmic Reasoners are Implicit Planners
EXecuted Latent Value Iteration Networks (XLVINs) are proposed, which provide improvements to data efficiency against value iteration-based implicit planners, as well as relevant model-free baselines, and empirically verify that XLVINs can closely align with value iteration.
Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
The MuZero algorithm is presented, which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics.
QMDP-Net: Deep Learning for Planning under Partial Observability
While QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.
Enhanced Symmetry Breaking in Cost-Optimal Planning as Forward Search
This work extends an effective framework for detecting and accounting for state symmetries within A* cost-optimal planning to allow for exploiting strictly larger symmetry classes, and thus pruning strictly larger parts of the search space.
The Detection and Exploitation of Symmetry in Planning Problems
A way of detecting and exploiting symmetry in the solution of problems that demonstrate these characteristics is described and a dramatic improvement in performance in solving problems exhibiting symmetry is achieved.
An algebraic approach to abstraction in reinforcement learning
This work introduces relativized options, a generalization of Markov sub-goal options that allows options to be defined without an absolute frame of reference. It also introduces an extension to the options framework that allows learning simultaneously at multiple levels of the hierarchy, states guarantees regarding the performance of hierarchical systems that employ approximate abstraction, and evaluates the approach in several test-beds.
MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning
This paper introduces MDP homomorphic networks for deep reinforcement learning and introduces an easy method for constructing equivariant network layers numerically, so the system designer need not solve the constraints by hand, as is typically done.