Corpus ID: 235732066

Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics

@article{Botteghi2021LowDimensionalSA,
  title={Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics},
  author={Nicol{\`o} Botteghi and Mannes Poel and Beril Sirmaçek and Christoph Brune},
  journal={ArXiv},
  year={2021},
  volume={abs/2107.01677}
}
Deep Reinforcement Learning has shown its ability to solve complicated problems directly from high-dimensional observations. However, in end-to-end settings, Reinforcement Learning algorithms are not sample-efficient and require long training times and large quantities of data. In this work, we propose a framework for sample-efficient Reinforcement Learning that takes advantage of state and action representations to transform a high-dimensional problem into a low-dimensional one. Moreover, we seek… 
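
Below is a minimal PyTorch-style sketch of the kind of state and action encoders the abstract refers to; it is an illustration, not the authors' code, and the module names and dimensions (StateEncoder, ActionEncoder, 100-dimensional observations, 4- and 2-dimensional latents) are assumptions.

```python
import torch
import torch.nn as nn

class StateEncoder(nn.Module):
    """Maps a high-dimensional observation to a low-dimensional latent state."""
    def __init__(self, obs_dim: int, latent_state_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_state_dim),
        )

    def forward(self, obs):
        return self.net(obs)

class ActionEncoder(nn.Module):
    """Maps a raw action to a low-dimensional latent action."""
    def __init__(self, action_dim: int, latent_action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(action_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_action_dim),
        )

    def forward(self, action):
        return self.net(action)

# Example: a 100-dimensional observation and 10-dimensional action are reduced
# to a 4-dimensional latent state and a 2-dimensional latent action.
state_enc, action_enc = StateEncoder(100, 4), ActionEncoder(10, 2)
z_s = state_enc(torch.randn(32, 100))   # (32, 4) latent states
z_a = action_enc(torch.randn(32, 10))   # (32, 2) latent actions
```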

References

Showing 1-10 of 37 references

Joint State-Action Embedding for Efficient Reinforcement Learning

TLDR
A new approach for jointly embedding states and actions is proposed that combines aspects of model-free and model-based reinforcement learning; it can be applied in both discrete and continuous domains and significantly outperforms state-of-the-art models in discrete domains with large state/action spaces.
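
A hedged sketch of what a joint state-action embedding can look like: a state encoder, an action embedding table, and a latent forward model (the model-based ingredient) trained to predict the next embedded state. The actual method has further components, and all names and dimensions here are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical joint embedding: encode state and action, then train a latent
# forward model whose embeddings a model-free policy can later reuse.
state_embed = nn.Linear(100, 8)      # assumed observation dim 100 -> latent 8
action_embed = nn.Embedding(500, 4)  # assumed 500 discrete actions -> latent 4
transition = nn.Linear(8 + 4, 8)     # predicts the next latent state

def embedding_loss(obs, action_idx, next_obs):
    z, u = state_embed(obs), action_embed(action_idx)
    z_next_pred = transition(torch.cat([z, u], dim=-1))
    return F.mse_loss(z_next_pred, state_embed(next_obs))
```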

DeepMDP: Learning Continuous Latent Space Models for Representation Learning

TLDR
This work introduces the concept of a DeepMDP, a parameterized latent space model that is trained via the minimization of two tractable losses: prediction of rewards and prediction of the distribution over next latent states, and shows that the optimization of these objectives guarantees the quality of the latent space as a representation of the state space.
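
An illustrative, simplified version of the two losses named above (reward prediction and next-latent-state prediction); a deterministic transition head stands in for the distributional one, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# DeepMDP-style latent model trained on two losses: predict the reward and
# predict the next latent state (here a point estimate rather than a distribution).
encoder = nn.Linear(100, 16)           # observation -> latent state (dims assumed)
reward_head = nn.Linear(16 + 4, 1)     # (latent state, action) -> predicted reward
transition_head = nn.Linear(16 + 4, 16)

def deepmdp_losses(obs, action, reward, next_obs):
    z, z_next = encoder(obs), encoder(next_obs)
    za = torch.cat([z, action], dim=-1)
    reward_loss = F.mse_loss(reward_head(za).squeeze(-1), reward)
    transition_loss = F.mse_loss(transition_head(za), z_next.detach())
    return reward_loss, transition_loss
```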

Contrastive Learning of Structured World Models

TLDR
These experiments demonstrate that C-SWMs can overcome limitations of models based on pixel reconstruction and outperform typical representatives of this model class in highly structured environments, while learning interpretable object-based representations.
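
A simplified sketch in the spirit of the contrastive transition loss described above; the real C-SWM factors the latent into object slots and uses a graph neural network, which this single-vector version omits, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

# Contrastive world-model loss: pull the predicted next latent state toward the
# true one, and push it away from negatives sampled by shuffling the batch.
encoder = nn.Linear(100, 16)
transition = nn.Linear(16 + 4, 16)
margin = 1.0

def contrastive_loss(obs, action, next_obs):
    z, z_next = encoder(obs), encoder(next_obs)
    z_pred = z + transition(torch.cat([z, action], dim=-1))
    positive = ((z_pred - z_next) ** 2).sum(dim=-1)
    z_neg = z_next[torch.randperm(z_next.size(0))]            # negatives from the batch
    negative = torch.clamp(margin - ((z_pred - z_neg) ** 2).sum(dim=-1), min=0.0)
    return (positive + negative).mean()
```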

Learning Action Representations for Reinforcement Learning

TLDR
This work provides an algorithm to both learn and use action representations, provides conditions for its convergence, and demonstrates the efficacy of the proposed method on large-scale real-world problems.
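
A hedged sketch of how learned action representations can be used at decision time: the policy outputs a point in the embedding space, and the executed discrete action is the one with the nearest learned embedding. Names and dimensions are assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

# Each of the (many) discrete actions gets a learned low-dimensional embedding;
# the policy acts in the small embedding space instead of over raw action indices.
num_actions, embed_dim, state_dim = 1000, 8, 32                # dimensions assumed
action_embeddings = nn.Parameter(torch.randn(num_actions, embed_dim))
policy = nn.Linear(state_dim, embed_dim)                        # state -> point in embedding space

def select_action(state):
    e = policy(state)                                           # (embed_dim,)
    distances = ((action_embeddings - e) ** 2).sum(dim=-1)      # distance to every action embedding
    return int(distances.argmin())                              # nearest discrete action
```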

Approximate Homomorphisms: A framework for non-exact minimization in Markov Decision Processes

TLDR
This article introduces approximate homomorphisms that allow us to construct useful abstract models even when the homomorphism conditions are not met exactly, and presents a result on bounding the loss resulting from this approximation.
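
As an illustration only (not the article's exact statement), approximate homomorphism conditions are typically stated by allowing the rewards and the aggregated transition probabilities of the abstract model to match the original MDP only up to tolerances:

```latex
% Illustrative approximate-homomorphism conditions; \epsilon_R and \epsilon_P are tolerances,
% h maps ground states to abstract states, g_s maps ground actions to abstract actions.
\[
  \bigl| R(s,a) - R'\bigl(h(s), g_s(a)\bigr) \bigr| \le \epsilon_R,
  \qquad
  \Bigl| \sum_{s' \in h^{-1}(B)} P(s' \mid s,a) - P'\bigl(B \mid h(s), g_s(a)\bigr) \Bigr| \le \epsilon_P
  \quad \text{for every abstract state } B .
\]
```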

Low Dimensional State Representation Learning with Reward-shaped Priors

TLDR
This work proposes a method that aims at learning a mapping from observations into a lower-dimensional state space via unsupervised learning, with loss functions shaped to incorporate prior knowledge of the environment and the task.
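
A hedged sketch of prior-shaped loss functions of the kind referred to above: a temporal-coherence term keeps consecutive latent states close, and a reward-based term pushes states with different rewards apart. The exact priors used in the paper may differ, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

encoder = nn.Linear(100, 5)  # observation dim and latent dim assumed

def prior_losses(obs, next_obs, reward):
    z, z_next = encoder(obs), encoder(next_obs)
    # Temporal coherence: consecutive latent states should change slowly.
    temporal_coherence = ((z_next - z) ** 2).sum(dim=-1).mean()
    # Reward prior: pairs of states with different rewards should lie far apart.
    perm = torch.randperm(z.size(0))
    reward_gap = (reward - reward[perm]).abs()
    latent_gap = ((z - z[perm]) ** 2).sum(dim=-1)
    reward_prior = (reward_gap * torch.exp(-latent_gap)).mean()
    return temporal_coherence + reward_prior
```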

Plannable Approximations to MDP Homomorphisms: Equivariance under Actions

TLDR
It is proved that when the loss is zero, the optimal policy in the abstract MDP can be successfully lifted to the original MDP, and a contrastive loss function is introduced that enforces action equivariance on the learned representations.
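
A hedged sketch of action equivariance and of lifting a plan from the abstract MDP: the encoder and latent transition model are assumed trained so that the encoding of the next state matches the latent transition applied to the current encoding (the paper's contrastive loss additionally uses negative samples, omitted here); dimensions and the scalar action encoding are simplifications.

```python
import torch
import torch.nn as nn

phi = nn.Linear(100, 4)   # observation -> abstract state (dims assumed)
T = nn.Linear(4 + 1, 4)   # (abstract state, action as a scalar) -> next abstract state

def equivariance_loss(obs, action, next_obs):
    # Enforce phi(s') ~= T(phi(s), a): acting then encoding equals encoding then transitioning.
    z, z_next = phi(obs), phi(next_obs)
    z_pred = T(torch.cat([z, action], dim=-1))
    return ((z_pred - z_next) ** 2).sum(dim=-1).mean()

def lift_greedy_action(obs, candidate_actions, z_goal):
    # Lift a latent plan: pick the ground action whose predicted abstract successor
    # is closest to the desired abstract state.
    z = phi(obs).expand(candidate_actions.size(0), -1)
    z_pred = T(torch.cat([z, candidate_actions], dim=-1))
    return int(((z_pred - z_goal) ** 2).sum(dim=-1).argmin())
```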

Controlling Assistive Robots with Learned Latent Actions

TLDR
A teleoperation algorithm for assistive robots that learns latent actions from task demonstrations is designed; the controllability, consistency, and scaling properties that user-friendly latent actions should have are formulated, and how different low-dimensional embeddings capture these properties is evaluated.
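
A hedged sketch of a latent-action decoder: a low-dimensional user input (for example, a two-axis joystick) is decoded, conditioned on the robot's state, into a full high-dimensional robot action. The architecture and dimensions are assumptions.

```python
import torch
import torch.nn as nn

latent_dim, state_dim, action_dim = 2, 10, 7   # e.g., a 2-axis joystick driving a 7-DoF arm
decoder = nn.Sequential(
    nn.Linear(latent_dim + state_dim, 64), nn.ReLU(),
    nn.Linear(64, action_dim),
)

def decode_latent_action(joystick_input, robot_state):
    # The same low-dimensional input can mean different motions in different states,
    # because the decoder is conditioned on the robot's current state.
    return decoder(torch.cat([joystick_input, robot_state], dim=-1))
```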

PyRep: Bringing V-REP to Deep Robot Learning

TLDR
The new PyRep toolkit offers three improvements: a simple and flexible API for robot control and scene manipulation, a new rendering engine, and speed boosts upwards of 10,000x in comparison to the previous Python Remote API.
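
A minimal usage sketch based on the toolkit's documented launch/start/step/stop workflow; the scene file name is a placeholder.

```python
from pyrep import PyRep

pr = PyRep()
pr.launch('scene.ttt', headless=True)  # load a CoppeliaSim/V-REP scene
pr.start()                             # start the simulation
for _ in range(100):
    pr.step()                          # advance physics by one time step
pr.stop()
pr.shutdown()
```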

Symmetries and Model Minimization in Markov Decision Processes

TLDR
This work extends the model minimization framework proposed by Dean and Givan to include symmetries and bases the framework on concepts derived from finite state automata and group theory.