1. Adaptive reinforcement learning with active state-specific exploration for engagement maximization during simulated child-robot interaction
- Author
Theodore Tsitsimis, George Velentzas, Iñaki Rañó, Costas S. Tzafestas, Mehdi Khamassi, School of Electrical and Computer Engineering (School of E.C.E), National Technical University of Athens (NTUA), Faculty of Computing and Engineering, University of Ulster, Institut des Systèmes Intelligents et de Robotique (ISIR), and Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
reinforcement learning, meta-learning, active exploration, engagement, joint action, human-robot interaction, human-computer interaction, autonomous robotics, robot learning, state space, artificial intelligence, cognitive neuroscience, behavioral neuroscience, developmental neuroscience, computer science
- Abstract
Using assistive robots for educational applications requires robots to be able to adapt their behavior specifically for each child with whom they interact. Among relevant signals, non-verbal cues such as the child's gaze can provide the robot with important information about the child's current engagement in the task, and whether the robot should continue its current behavior or not. Here we propose a reinforcement learning algorithm extended with active state-specific exploration and show its applicability to child engagement maximization as well as to more classical tasks such as maze navigation. We first demonstrate its adaptive nature on a continuous maze problem, an enhancement of the classic grid world. There, parameterized actions enable the agent to learn single moves that take it to the end of a corridor, similarly to "options" but without explicit hierarchical representations. We then apply the algorithm to a series of simulated child-robot interaction scenarios, such as an extended Tower of Hanoi where the robot should find the speed of movement appropriate for the interacting child, and a pointing task where the robot should find the child-specific appropriate level of action expressivity. We show that the algorithm enables the agent to cope with both global and local non-stationarities in the state space while preserving stable behavior in other, stationary portions of the state space. Altogether, these results suggest a promising way to enable robot learning from non-verbal cues while coping with the high degree of non-stationarity that can occur during interaction with children.
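To give a concrete (and deliberately simplified) picture of what state-specific exploration can look like, the sketch below shows a tabular Q-learning agent whose softmax exploration is controlled by a separate inverse temperature for each state, modulated by recent unsigned reward prediction errors in that state. This is an illustrative assumption for the general idea, not the authors' algorithm: the class name, parameter names, and the particular meta-update rule are invented for the example, and rewards are assumed to lie roughly in [0, 1].

```python
import numpy as np

# Illustrative sketch (not the paper's implementation) of tabular Q-learning
# with per-state exploration: each state keeps its own inverse temperature
# beta[s]. When recent unsigned reward prediction errors in a state are high
# (local non-stationarity), beta[s] is lowered so the agent re-explores there;
# when they are low, beta[s] rises and the agent exploits.
class StateSpecificExplorationAgent:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 beta_min=0.5, beta_max=10.0, eta=0.05):
        self.Q = np.zeros((n_states, n_actions))
        self.beta = np.full(n_states, beta_min)   # per-state inverse temperature
        self.err_trace = np.zeros(n_states)       # running average of |RPE| per state
        self.alpha, self.gamma = alpha, gamma
        self.beta_min, self.beta_max, self.eta = beta_min, beta_max, eta

    def act(self, s, rng=np.random):
        # Softmax over Q-values using this state's own inverse temperature.
        prefs = self.beta[s] * self.Q[s]
        p = np.exp(prefs - prefs.max())
        p /= p.sum()
        return rng.choice(len(p), p=p)

    def update(self, s, a, r, s_next):
        # Standard temporal-difference error and Q-value update.
        rpe = r + self.gamma * self.Q[s_next].max() - self.Q[s, a]
        self.Q[s, a] += self.alpha * rpe
        # Track recent unsigned prediction error in this state only.
        self.err_trace[s] += self.eta * (abs(rpe) - self.err_trace[s])
        # High local surprise -> low beta (explore); low surprise -> high beta (exploit).
        surprise = min(1.0, self.err_trace[s])    # assumes rewards roughly in [0, 1]
        self.beta[s] = self.beta_max - (self.beta_max - self.beta_min) * surprise
```

Because the exploration signal is tracked per state, a sudden change in one region of the state space (e.g., the child disengaging from one part of the task) triggers renewed exploration only there, leaving behavior in stationary regions untouched.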
- Published
- 2018