Corpus ID: 229348756

Explicitly Encouraging Low Fractional Dimensional Trajectories Via Reinforcement Learning

@inproceedings{Gillen2020ExplicitlyEL,
  title={Explicitly Encouraging Low Fractional Dimensional Trajectories Via Reinforcement Learning},
  author={Sean Patrick Gillen and Katie Byl},
  booktitle={CoRL},
  year={2020}
}
A key limitation in using modern machine learning methods to develop feedback control policies is the lack of appropriate methodologies for analyzing their long-term dynamics, in terms of making any sort of guarantee (even a statistical one) about robustness. The central reasons for this are the so-called curse of dimensionality and the black-box nature of the resulting control policies themselves. This paper addresses the first of these issues. Although the full…
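Based on the title and the citing work below, the paper's core idea appears to be shaping the reward so that the closed-loop trajectory occupies a set of low fractal dimension. As a rough, illustrative sketch only (the dimension estimator, radii, and penalty weight here are assumptions, not the authors' formulation), a correlation-dimension penalty on an episode's visited states could be computed along these lines in Python:

import numpy as np

def correlation_dimension(states, radii=None):
    # Rough Grassberger-Procaccia estimate of the correlation dimension
    # of a trajectory: `states` is a (T, n) array of visited states.
    states = np.asarray(states, dtype=float)
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    pair_d = dists[np.triu_indices(len(states), k=1)]
    if radii is None:
        lo, hi = np.percentile(pair_d[pair_d > 0], [10, 90])
        radii = np.geomspace(lo, hi, 10)
    # Correlation sum C(r): fraction of state pairs closer than r.
    C = np.array([(pair_d < r).mean() for r in radii])
    # The slope of log C(r) versus log r approximates the dimension.
    slope, _ = np.polyfit(np.log(radii), np.log(C + 1e-12), 1)
    return slope

def shaped_return(task_rewards, states, weight=0.1):
    # Hypothetical shaped episode return: task reward minus a penalty
    # proportional to the estimated fractal dimension of the trajectory.
    return np.sum(task_rewards) - weight * correlation_dimension(states)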

Citations

Mesh Based Analysis of Low Fractal Dimension Reinforcement Learning Policies
  • S. Gillen, Katie Byl
  • Computer Science
  • 2021 IEEE International Conference on Robotics and Automation (ICRA)
  • 2021
TLDR
This work builds meshes of the reachable state space of a system that is subject to disturbances and controlled by policies obtained with the modified reward, and shows that agents trained with the fractal-dimension reward carry over their desirable property of a more compact reachable state space to a setting with external disturbances.
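A minimal sketch of what such a mesh of the reachable state space might look like, assuming a simulator step(x, u, w) that returns the next state under disturbance w, a policy(x) that returns the control, and a uniform grid for quantization (all illustrative assumptions rather than the cited work's exact construction):

import numpy as np

def build_reachable_mesh(step, policy, x0, cell=0.1, disturbance=0.05,
                         n_samples=20, max_iters=50, rng=None):
    # Expand a set of quantized reachable states by simulating the
    # closed-loop system from each frontier point under random pushes.
    rng = np.random.default_rng() if rng is None else rng
    quantize = lambda x: tuple(np.round(np.asarray(x) / cell).astype(int))
    x0 = np.asarray(x0, dtype=float)
    mesh = {quantize(x0): x0}
    frontier = [x0]
    for _ in range(max_iters):
        new_frontier = []
        for x in frontier:
            for _ in range(n_samples):
                w = rng.uniform(-disturbance, disturbance, size=x.shape)
                x_next = np.asarray(step(x, policy(x), w), dtype=float)
                key = quantize(x_next)
                if key not in mesh:
                    mesh[key] = x_next
                    new_frontier.append(x_next)
        if not new_frontier:
            break
        frontier = new_frontier
    # The number of occupied cells is a proxy for how compact the
    # reachable state space is under the policy and disturbances.
    return mesh

A smaller mesh under the same disturbance level would indicate the more compact reachable set that the fractal-dimension reward is reported to encourage.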
Direct Random Search for Fine Tuning of Deep Reinforcement Learning Policies
TLDR
Direct random search is very effective at fine-tuning DRL policies by optimizing them directly with deterministic rollouts, and it can be used to extend previous work on shrinking the dimensionality of the reachable state space of closed-loop systems run under deep neural network policies.
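As an illustration of the general idea (not the cited paper's exact algorithm), fine-tuning by direct random search on a flattened policy parameter vector theta, with a hypothetical rollout_return(theta) that runs one deterministic rollout and returns its total reward, could look like this greedy variant:

import numpy as np

def fine_tune(theta, rollout_return, n_iters=100, n_dirs=8,
              step_size=0.02, rng=None):
    # Greedy direct random search: try random perturbations of the
    # parameters and keep any that improve the deterministic return.
    rng = np.random.default_rng() if rng is None else rng
    best = rollout_return(theta)
    for _ in range(n_iters):
        for _ in range(n_dirs):
            candidate = theta + step_size * rng.standard_normal(theta.shape)
            score = rollout_return(candidate)
            if score > best:
                theta, best = candidate, score
    return theta, best

Because the rollouts are deterministic, a candidate is accepted only when it actually improves the measured return, which makes the procedure suited to fine-tuning an already-trained policy.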

References

SHOWING 1-10 OF 16 REFERENCES
Simple random search of static linear policies is competitive for reinforcement learning
TLDR
This work introduces a model-free random search algorithm for training static, linear policies for continuous control problems and evaluates its performance over hundreds of random seeds and many different hyperparameter configurations for each benchmark task.
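Since this reference is itself about the search algorithm, a minimal sketch of the basic (un-augmented) random search variant for a static linear policy u = M x is given below; the observation normalization and top-direction selection of full ARS are omitted, and the function name rollout_return and all hyperparameters are illustrative assumptions:

import numpy as np

def basic_random_search(rollout_return, n, p, n_iters=200, n_dirs=16,
                        alpha=0.02, nu=0.03, rng=None):
    # Train a static linear policy M (p-by-n) by stepping along random
    # directions weighted by the difference of episode returns.
    # rollout_return(M) is assumed to run one episode with controls
    # u = M @ x and return the total reward.
    rng = np.random.default_rng() if rng is None else rng
    M = np.zeros((p, n))
    for _ in range(n_iters):
        deltas = rng.standard_normal((n_dirs, p, n))
        r_plus = np.array([rollout_return(M + nu * d) for d in deltas])
        r_minus = np.array([rollout_return(M - nu * d) for d in deltas])
        M = M + alpha * np.tensordot(r_plus - r_minus, deltas, axes=1) / n_dirs
    return M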
Learning dexterous in-hand manipulation
TLDR
This work uses reinforcement learning (RL) to learn dexterous in-hand manipulation policies that perform vision-based object reorientation on a physical Shadow Dexterous Hand; these policies transfer to the physical robot despite being trained entirely in simulation.
Learning agile and dynamic motor skills for legged robots
TLDR
This work introduces a method for training a neural network policy in simulation and transferring it to a state-of-the-art legged system, thereby leveraging fast, automated, and cost-effective data generation schemes.
Mesh-based Methods for Quantifying and Improving Robustness of a Planar Biped Model to Random Push Disturbances
TLDR
This paper applies meshing tools to analyze and improve the robustness of a 5-link planar biped model to random push perturbations, and conducts simulations on two different sets of trajectories to validate the effectiveness of these tools.
Mesh-based Tools to Analyze Deep Reinforcement Learning Policies for Underactuated Biped Locomotion
TLDR
A mesh-based approach to analyze the stability and robustness of policies obtained via deep reinforcement learning for various biped gaits of a five-link planar model, motivated by the twin hypotheses that contraction of the dynamics can reduce the required complexity of a control policy and that control policies obtained via deep learning may exhibit a tendency to contract to lower-dimensional manifolds within the full state space.
Robust Recovery Controller for a Quadrupedal Robot using Deep Reinforcement Learning
TLDR
This paper presents an approach based on model-free deep reinforcement learning to control recovery maneuvers of quadrupedal robots, using a hierarchical behavior-based controller that produces dynamic and reactive recovery behaviors to recover from an arbitrary fall configuration in less than 5 seconds.
Emergence of Locomotion Behaviours in Rich Environments
TLDR
This paper explores how a rich environment can help to promote the learning of complex behavior, and finds that this encourages the emergence of robust behaviours that perform well across a suite of tasks.
Mesh-based switching control for robust and agile dynamic gaits
TLDR
In planning agile motions for the authors' legged system model, the mesh-based policies predict future dynamics robustly for plans up to about a 5-step horizon; in quantifying controller sets, it is emphasized that both the number of such controllers and their parameterizations should be considered in tandem during optimization.
OpenAI Gym
  • G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, W. Zaremba
  • 2016