Corpus ID: 23763602

Online adaptation to human engagement perturbations in simulated human-robot interaction using hybrid reinforcement learning

Theodore Tsitsimis, George Velentzas, Mehdi Khamassi, Costas S. Tzafestas
Dynamic uncontrolled human-robot interaction requires robots to adapt to changes in the human's behavior and intentions. Among relevant signals, non-verbal cues such as the human's gaze can provide the robot with important information about the human's current engagement in the task, and about whether the robot should continue its current behavior. In a previous work [1] we proposed an active exploration algorithm for reinforcement learning where the reward function is the weighted…



Active Exploration and Parameterized Reinforcement Learning Applied to a Simulated Human-Robot Interaction Task
This work proposes an active exploration algorithm for RL in a structured (parameterized) continuous action space and shows that it outperforms continuous parameterized RL both without active exploration and with active exploration based on uncertainty variations measured by a Kalman-Q-learning algorithm.
Evaluating the Engagement with Social Robots
This paper introduces a set of engagement metrics for direct, face-to-face scenarios, based on analysis of the human partners' behavior, and shows how such metrics can assess how the robot is perceived by humans and how this perception changes according to the behaviors shown by the social robot.
Policy search for motor primitives in robotics
A novel EM-inspired algorithm for policy learning, particularly well suited to dynamical-system motor primitives, is introduced and applied in the context of motor learning; it can learn a complex Ball-in-a-Cup task on a real Barrett WAM™ robot arm.
Robot Skill Learning: From Reinforcement Learning to Evolution Strategies
It is striking that PI2 and (μW, λ)-ES share a common core, and that the simpler algorithm converges faster and leads to similar or lower final costs; this is attributed to a third trend in robot skill learning: the predominant use of dynamic movement primitives.
Reinforcement learning in robotics: A survey
This article attempts to strengthen the links between the two research communities by surveying work on reinforcement learning for behavior generation in robots, highlighting both key challenges in robot reinforcement learning and notable successes.
Deep Reinforcement Learning in Parameterized Action Space
This paper presents a successful extension of deep reinforcement learning to the class of parameterized-action-space MDPs within the domain of simulated RoboCup soccer, which features a small set of discrete action types, each parameterized with continuous variables.
Reinforcement Learning in Continuous Action Spaces
H. van Hasselt, M. Wiering — 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 2007
This work presents a new class of algorithms, the continuous actor-critic learning automaton (CACLA), that can handle continuous states and actions, and shows that CACLA performs much better than the other algorithms, especially when combined with a Gaussian exploration method.
Modeling choice and reaction time during arbitrary visuomotor learning through the coordination of adaptive working memory and reinforcement learning
A dual-system computational model is developed that can predict both performance and reaction times during learning of a stimulus–response association task; a scheme for coordinating QL and BWM is proposed in which the expensive memory manipulation is controlled by, among other factors, the level of convergence of the habitual learning.
Meta-learning in Reinforcement Learning
It is suggested that the phasic and tonic components of dopamine neuron firing can encode the signals required for meta-learning of reinforcement learning.
Reinforcement Learning with Parameterized Actions
The Q-PAMDP algorithm is introduced for learning in Markov decision processes with parameterized actions (discrete actions with continuous parameters), and it is shown to converge to a local optimum.
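The parameterized-action structure shared by this entry and the RoboCup paper above can be made concrete with a small data-structure sketch: each discrete action type carries its own continuous parameter vector with its own bounds. The action types and bounds below are illustrative placeholders, not taken from either paper.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class ParamAction:
    kind: str            # discrete action type, e.g. "kick"
    params: np.ndarray   # continuous parameters for that type

# Per-type parameter bounds, rows of [low, high] (hypothetical values).
ACTION_SPACE = {
    "kick": np.array([[0.0, 1.0], [-np.pi, np.pi]]),  # power, direction
    "dash": np.array([[0.0, 1.0]]),                   # speed
    "turn": np.array([[-np.pi, np.pi]]),              # angle
}

def sample_action(rng):
    """Uniformly sample a discrete type, then its continuous parameters."""
    kind = str(rng.choice(list(ACTION_SPACE)))
    lo, hi = ACTION_SPACE[kind][:, 0], ACTION_SPACE[kind][:, 1]
    return ParamAction(kind, rng.uniform(lo, hi))
```

A learner in this space must handle both levels jointly: a discrete choice over `kind` and, conditioned on it, a continuous optimization over `params`, which is what distinguishes these methods from purely discrete or purely continuous RL.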