• Corpus ID: 235795636

Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing

Kaixin Wang, Kuangqi Zhou, Qixin Zhang, Jie Shao, Bryan Hooi, Jiashi Feng

The Laplacian representation has recently gained increasing attention in reinforcement learning because it provides a succinct and informative representation of states, taking the eigenvectors of the Laplacian matrix of the state-transition graph as state embeddings. Such a representation captures the geometry of the underlying state space and benefits RL tasks such as option discovery and reward shaping. To approximate the Laplacian representation in large (or even continuous) state spaces…
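As a rough illustration of the idea in the abstract, the sketch below computes a Laplacian representation exactly for a tiny tabular problem. It is not the paper's method, which targets large or continuous state spaces. The corridor environment, the function name `laplacian_representation`, and the use of the unnormalized Laplacian L = D - A are all assumptions made for this example.

```python
import numpy as np


def laplacian_representation(adjacency: np.ndarray, d: int) -> np.ndarray:
    """Return a (num_states, d) embedding built from the d eigenvectors
    of the graph Laplacian with the smallest eigenvalues."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency  # unnormalized Laplacian (assumed variant)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    _, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvectors[:, :d]


# Hypothetical toy environment: 5 states in a corridor, where each state
# can transition to its left and right neighbours.
num_states = 5
adjacency = np.zeros((num_states, num_states))
for s in range(num_states - 1):
    adjacency[s, s + 1] = 1.0
    adjacency[s + 1, s] = 1.0

# Row i of `embedding` is the representation of state i.
embedding = laplacian_representation(adjacency, d=2)
```

In this toy case the first coordinate is constant across states, while the second varies monotonically along the corridor. This is the sense in which such embeddings reflect the geometry of the state space.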

Reachability-Aware Laplacian Representation in Reinforcement Learning

A Reachability-Aware Laplacian Representation (RA-LapRep) is introduced, which captures inter-state reachability better than LapRep, as shown through both theoretical explanations and experimental results.

Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs

This paper shows how to directly compute multi-agent options with collaborative exploratory behaviors while still enjoying the ease of decomposition, and proposes a deep learning extension of the method by estimating eigenfunctions through NN-based representation learning techniques.

Multi-agent Covering Option Discovery through Kronecker Product of Factor Graphs

The proposed algorithm can successfully identify multi-agent options, and significantly outperforms prior works using single-agent options or no options, in terms of both faster exploration and higher cumulative rewards.

Learning Multi-agent Options for Tabular Reinforcement Learning using Factor Graphs

The proposed multi-agent option discovery approach addresses this problem by alleviating the exponential complexity involved in multi-agent explorations and achieves significantly improved exploration and higher cumulative rewards in challenging multi-agent decision making scenarios.

Multi-agent Covering Option Discovery based on Kronecker Product of Factor Graphs

This paper shows that it is indeed possible to directly compute multi-agent options with collaborative exploratory behaviors among the agents while still enjoying the ease of decomposition. It approximates the joint state space as a Kronecker graph, based on which the proposed algorithm can successfully identify multi-agent options.

Temporal Abstraction in Reinforcement Learning with the Successor Representation

This paper argues that the successor representation, which encodes states based on the pattern of state visitation that follows them, can be seen as a natural substrate for the discovery and use of temporal abstractions and takes a big picture view of recent results, showing how it can be used to discover options that facilitate either temporally-extended exploration or planning.



The Laplacian in RL: Learning Representations with Efficient Approximations

This paper presents a fully general and scalable method for approximating the eigenvectors of the Laplacian in a model-free RL context, and empirically shows that it generalizes beyond the tabular, finite-state setting.

Count-Based Exploration with the Successor Representation

A simple approach for exploration in reinforcement learning (RL) that allows us to develop theoretically justified algorithms in the tabular case but that is also extendable to settings where function approximation is required and achieves state-of-the-art performance in Atari 2600 games when in a low sample-complexity regime.

Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning

A theoretically motivated policy similarity metric (PSM) for measuring behavioral similarity between states is introduced, and it is demonstrated that policy similarity embeddings (PSEs) improve generalization on diverse benchmarks, including LQR with spurious correlations, a jumping task from pixels, and Distracting DM Control Suite.

Exploration in Reinforcement Learning with Deep Covering Options

Deep covering options is introduced, an online method that extends covering options to large state spaces, automatically discovering task-agnostic options that encourage exploration and substantially improving both the exploration and the total accumulated reward.

Decoupling Representation Learning from Reinforcement Learning

A new unsupervised learning task, called Augmented Temporal Contrast (ATC), which trains a convolutional encoder to associate pairs of observations separated by a short time difference, under image augmentations and using a contrastive loss.

A Laplacian Framework for Option Discovery in Reinforcement Learning

This paper addresses the option discovery problem by showing how proto-value functions (PVFs) implicitly define options. It does so by introducing eigenpurposes, intrinsic reward functions derived from the learned representations, which traverse the principal directions of the state space.

Curiosity-Driven Exploration by Self-Supervised Prediction

This work formulates curiosity as the error in an agent's ability to predict the consequence of its own actions in a visual feature space learned by a self-supervised inverse dynamics model, which scales to high-dimensional continuous state spaces like images, bypasses the difficulties of directly predicting pixels, and ignores the aspects of the environment that cannot affect the agent.

SpectralNet: Spectral Clustering using Deep Neural Networks

A deep learning approach to spectral clustering that overcomes the major limitations of scalability and generalization of the spectral embedding and applies VC dimension theory to derive a lower bound on the size of SpectralNet.

Temporal Difference Models: Model-Free Deep RL for Model-Based Control

Model-free reinforcement learning (RL) is a powerful, general tool for learning complex behaviors. However, its sample efficiency is often impractically large for solving challenging real-world

Spectral Inference Networks: Unifying Deep and Spectral Learning

The results demonstrate that Spectral Inference Networks accurately recover eigenfunctions of linear operators and can discover interpretable representations from video in a fully unsupervised manner.