Learning while preventing mechanical failure due to random motions

@article{meijdam2013learning,
  title={Learning while preventing mechanical failure due to random motions},
  author={Hendrik Jan Meijdam and M. Plooij and W. Caarls},
  journal={2013 IEEE/RSJ International Conference on Intelligent Robots and Systems},
  year={2013}
}
Learning can be used to optimize robot motions for new situations. However, during the exploration phase, learning can produce high-frequency random motions that cause mechanical failure before the motion is learned. The mean time between failures (MTBF) of a robot performing these motions can be predicted, and the predicted MTBF in the exploration phase can be increased by filtering the actions, or the possible actions, of the learning algorithm. We investigated five algorithms that apply this filtering in various ways.
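As an illustration of the kind of action filtering the abstract describes, the sketch below applies a simple first-order low-pass filter to exploration actions. The paper compares five filtering variants; this sketch is a generic example, not a reproduction of any of them, and the smoothing factor `alpha` is an illustrative assumption.

```python
import numpy as np

def filtered_action(raw_action, prev_action, alpha=0.7):
    """First-order low-pass filter on exploration actions.

    Blends the new (possibly random) action with the previously executed
    one, suppressing the high-frequency components that stress hardware.
    alpha is an illustrative smoothing factor, not a value from the paper.
    """
    return alpha * prev_action + (1.0 - alpha) * raw_action

# Example: smooth a sequence of random exploration actions.
rng = np.random.default_rng(42)
raw = rng.normal(size=1000)          # unfiltered random actions
filt = np.empty_like(raw)
filt[0] = raw[0]
for i in range(1, len(raw)):
    filt[i] = filtered_action(raw[i], filt[i - 1])
```

The filtered sequence changes far less from step to step than the raw one, which is exactly the property that reduces high-frequency mechanical load.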
Sample-Efficient Reinforcement Learning for Walking Robots
By learning to walk, robots should be able to traverse many types of terrain. An important learning paradigm for robots is Reinforcement Learning (RL). Learning to walk through RL with real robots …
Evaluation of physical damage associated with action selection strategies in reinforcement learning
Inspired by the OU and PADA methods, four new action-selection methods are proposed in a systematic way; one of the proposed methods, using time-correlated noise, outperforms the well-known ε-greedy method in all three benchmarks.
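The time-correlated noise referred to above is typically an Ornstein–Uhlenbeck (OU) process, whose samples drift smoothly instead of jumping independently. A minimal sketch, with illustrative parameter values (`theta`, `sigma`) that are assumptions rather than values from the paper:

```python
import numpy as np

def ou_noise(n_steps, theta=0.15, sigma=0.2, dt=1.0, x0=0.0, seed=0):
    """Ornstein-Uhlenbeck process: time-correlated exploration noise.

    Each sample is pulled back toward 0 (rate theta) and perturbed by
    Gaussian noise (scale sigma), so successive samples are correlated.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        x[t] = x[t - 1] + theta * (0.0 - x[t - 1]) * dt \
               + sigma * np.sqrt(dt) * rng.normal()
    return x
```

Because consecutive samples are strongly correlated, perturbing actions with OU noise avoids the high-frequency chatter that independent (white) exploration noise produces.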
Safer reinforcement learning for robotics
Reinforcement learning is an active research area in the fields of artificial intelligence and machine learning, with applications in control. The most important feature of reinforcement learning is …
Parallel Online Temporal Difference Learning for Motor Control
  • W. Caarls, E. Schuitema
  • Computer Science, Medicine
  • IEEE Transactions on Neural Networks and Learning Systems
  • 2016
This paper shows that TD learning can work effectively in real robotic systems as well, using parallel model learning and planning, achieving a speedup of almost two orders of magnitude over regular TD control on simulated control benchmarks.
Generalized exploration in policy search
This paper introduces a unifying view on step-based and episode-based exploration that allows for such balanced trade-offs, and evaluates the exploration strategy on four dynamical systems, showing that a more balanced trade-off can yield faster learning and better final policies.
Motion Design for Humanoids Based on Principal Component Analysis: Application to Human-Inspired Falling Motion Control
  • Miwa Masuda, J. Ishikawa
  • Computer Science
  • 2015 IEEE International Conference on Systems, Man, and Cybernetics
  • 2015
The experimental results showed that the humanoid achieved three quarters of the full rotation and that the proposed method could be utilized as a basis to introduce an iterative learning control to completely imitate the human-inspired forward-rolling motion.
Deep Reinforcement Learning with Embedded LQR Controllers
  • W. Caarls
  • Computer Science, Engineering
  • ArXiv
  • 2021
This work introduces a method that integrates LQR control into the action set, allowing generalization and avoiding fixing the computed control in the replay memory when it is based on learned dynamics.


Reinforcement Learning on autonomous humanoid robots
Service robots have the potential to be of great value in households, health care and other labor-intensive environments. However, these environments are typically unique, not very structured and …
Policy gradient reinforcement learning for fast quadrupedal locomotion
  • Nate Kohl, P. Stone
  • Computer Science, Engineering
  • IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004
  • 2004
A machine learning approach is presented for optimizing a quadrupedal trot gait for forward speed, using a form of policy gradient reinforcement learning to automatically search the set of possible parameters with the goal of finding the fastest possible walk.
Neural Reinforcement Learning Controllers for a Real Robot Application
It is described how highly effective speed controllers can be learned from scratch directly on the real robot, using the recently developed neural fitted Q iteration scheme, which allows reinforcement learning of neural controllers with only a limited amount of training data.
The design of LEO: A 2D bipedal walking robot for online autonomous Reinforcement Learning
This work derives the main hardware and software requirements that an RL robot should fulfill, and presents a biped robot, LEO, that was specifically designed to meet these requirements.
Biped dynamic walking using reinforcement learning
This paper presents some results from a study of biped dynamic walking using reinforcement learning. During this study a hardware biped robot was built, a new reinforcement learning algorithm as well …
Reinforcement Learning: An Introduction
This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Efficient Model Learning Methods for Actor–Critic Control
Two new actor-critic algorithms for reinforcement learning are proposed that learn a process model and a reference model representing a desired behavior, from which desired control actions can be calculated using the inverse of the learned process model.
Mean Stress Effects in Stress-Life and Strain-Life Fatigue
Various approaches to estimating mean stress effects on stress-life and strain-life behavior are compared with test data for engineering metals. The modified Goodman equation with the ultimate …
Fatigue Failure Predictions for Complicated Stress-Strain Histories
A cumulative damage procedure is developed to predict the fatigue failure of engineering metals subjected to complicated stress-strain histories. Histories with plastic strainings and …
Development of interlimb movement synchrony in the rat fetus.
Interlimb synchrony appears to be a robust method of quantifying fetal movement and may prove useful as a tool for assessing prenatal nervous system functioning.