Adaptive artificial companions learning from users’ feedback

Abir B Karami, Benoit Encelle, Karim Sehaba

doi:10.1177/1059712316634062

Abstract

Until recently, proposals for intelligent service companions such as robots were mostly independent of the user and the environment. Our work is part of the FUI-RoboPopuli project, which concentrates on endowing entertainment companion robots with adaptive and social behavior. More precisely, we focus on the capacity of an intelligent system to learn how to personalize and adapt its behavior and actions to its interaction situation, which describes (a) the current user attributes and (b) the current environment attributes. Our approach is based on Markov decision process (MDP) models, which are widely used in adaptive robot applications. To obtain relevant adaptive behavior as quickly as possible, whatever the interaction situation, several approaches have been proposed to decrease the sample complexity required to learn the MDP model, including its reward function. We argue that systems based on detecting the important attributes for each decision are likely to converge faster than others. To this end, we present two algorithms that learn the MDP reward function by analyzing interaction traces (i.e., the interaction history between the robot and its users, including their feedback on the robot’s actions). The first algorithm is direct and certain, and does not particularly exploit its knowledge to adapt to unknown situations (i.e., unknown user and/or environment settings). The second is able to detect the importance of certain situation attributes in the adaptation process. Detecting important attributes speeds up learning and helps generalize the learned reward function to unknown situations. In this paper, we present both learning algorithms, simulated experiments, and an experiment with the EMOX (EMOtion eXchange) robot, which was upgraded during the FUI-RoboPopuli project. The results of those experiments show that, in adaptive decision making, detecting the important attributes for each decision speeds up the learning process and helps achieve convergence using fewer samples. We also present a scaling analysis through the simulated experiments.
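
The contrast between the two kinds of learner described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithms, which the abstract does not specify. It assumes a trace is a (situation, action, feedback) triple with numeric feedback, and it swaps in a simple stand-in for attribute detection: an attribute counts as important for an action if the mean feedback differs across its values by more than a threshold `tol`. The attribute names (`age`, `noise`), the action `play_music`, the data, and all function names are hypothetical.

```python
from collections import defaultdict

def learn_direct(traces):
    """Average feedback per exact (situation, action) pair; no generalization."""
    sums = defaultdict(lambda: [0.0, 0])
    for situation, action, feedback in traces:
        key = (frozenset(situation.items()), action)
        sums[key][0] += feedback
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}

def important_attributes(traces, action, tol=0.1):
    """Stand-in test (an assumption): an attribute matters for an action if
    the mean feedback differs across its observed values by more than tol."""
    attrs = {a for s, act, _ in traces if act == action for a in s}
    important = set()
    for attr in attrs:
        by_value = defaultdict(list)
        for s, act, fb in traces:
            if act == action and attr in s:
                by_value[s[attr]].append(fb)
        means = [sum(v) / len(v) for v in by_value.values()]
        if max(means) - min(means) > tol:
            important.add(attr)
    return important

def learn_generalized(traces, tol=0.1):
    """Average feedback per action, keyed only on that action's important attributes."""
    model = {}
    for action in {act for _, act, _ in traces}:
        imp = important_attributes(traces, action, tol)
        sums = defaultdict(lambda: [0.0, 0])
        for s, act, fb in traces:
            if act != action:
                continue
            key = frozenset((k, v) for k, v in s.items() if k in imp)
            sums[key][0] += fb
            sums[key][1] += 1
        model[action] = (imp, {k: t / n for k, (t, n) in sums.items()})
    return model

def predict(model, situation, action):
    """Estimated reward, or None if the projected situation was never seen."""
    imp, table = model[action]
    key = frozenset((k, v) for k, v in situation.items() if k in imp)
    return table.get(key)

# Hypothetical traces: feedback on play_music depends on noise, not on age.
traces = [
    ({"age": "child", "noise": "quiet"}, "play_music", 1.0),
    ({"age": "child", "noise": "loud"}, "play_music", -1.0),
    ({"age": "adult", "noise": "quiet"}, "play_music", 1.0),
    ({"age": "adult", "noise": "loud"}, "play_music", -1.0),
]
unseen = {"age": "senior", "noise": "quiet"}  # this exact situation never occurs above

direct = learn_direct(traces)
direct_estimate = direct.get((frozenset(unseen.items()), "play_music"))
generalized = learn_generalized(traces)
generalized_estimate = predict(generalized, unseen, "play_music")
```

In this toy setting the direct table has no entry for the unseen situation, while the attribute-based model ignores `age` and reuses what it learned about `noise`. Comparing marginal means is a deliberately crude test: it can be misled by unbalanced or correlated attributes, so it should be read only as an illustration of why attribute detection can reduce the number of samples needed.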

Journal: Adaptive Behavior
Publication Date: Mar 22, 2016
Citations: 23

Similar Papers

Combining manual feedback with subsequent MDP reward signals for reinforcement learning
...
10 May 2010

Complex-Valued Reinforcement Learning: a Context-Based Approach for POMDPs
Takeshi Shibuya ... Tomoki Hamagami
14 Jan 2011

Reinforcement learning with Gaussian processes for condition-based maintenance
Shenglin Peng ... Qianmei (May) Feng
Computers & Industrial Engineering | VOL. 158
16 Apr 2021

Introduction of Reinforcement Learning and Its Application Across Different Domain
Harshita Sharma ... Hritik Kumar
International Journal of Scientific Research in Computer Science, Engineering and Information Technology | VOL. -
05 Nov 2023
