Abstract

As a powerful and intelligent machine learning method, reinforcement learning (RL) has been widely used in many fields such as game theory, adaptive control, multi-agent systems, nonlinear forecasting, and so on. The main contribution of this technique is its exploration and exploitation approach to finding optimal or near-optimal solutions of goal-directed problems. However, when RL is applied to multi-agent systems (MASs), problems such as the “curse of dimensionality”, the “perceptual aliasing problem”, and the uncertainty of the environment constitute high hurdles for RL. Meanwhile, although RL is inspired by behavioral psychology and uses reward/punishment from the environment, higher mental factors such as affect, emotion, and motivation are rarely adopted in its learning procedure. In this paper, to address the challenges of agent learning in MASs, we propose a computational motivation function, which adopts the two principal affective factors “Arousal” and “Pleasure” of Russell’s circumplex model of affect, to improve the learning performance of a conventional RL algorithm named Q-learning (QL). Computer simulations of pursuit problems with static and dynamic prey were carried out, and the results showed that, compared with conventional QL, the proposed method gives agents faster and more stable learning performance.
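To make the idea concrete, the sketch below shows one plausible way an affect-based motivation term could be folded into tabular Q-learning. This is a minimal illustration, not the authors' exact formulation: the function names (`motivation`, `boltzmann_policy`, `q_learning_step`), the linear combination of "Arousal" and "Pleasure", the weights `w_a`/`w_p`, and the placement of the motivation value inside the TD target are all assumptions made for illustration.

```python
import numpy as np

def motivation(arousal, pleasure, w_a=0.5, w_p=0.5):
    """Hypothetical motivation value combining Russell's two affect
    dimensions (arousal, pleasure); the paper's exact form may differ."""
    return w_a * arousal + w_p * pleasure

def boltzmann_policy(q_values, temperature=1.0):
    """Boltzmann (softmax) action-selection probabilities over Q-values."""
    prefs = np.exp((q_values - q_values.max()) / temperature)
    return prefs / prefs.sum()

def q_learning_step(Q, s, a, r, s_next, arousal, pleasure,
                    alpha=0.1, gamma=0.9):
    """One tabular Q-learning update in which the environmental reward is
    augmented by a motivation term (illustrative placement of the affect
    signal, assumed here rather than taken from the paper)."""
    m = motivation(arousal, pleasure)
    td_target = (r + m) + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Minimal usage example: 5 states, 4 actions (e.g., a tiny pursuit grid).
Q = np.zeros((5, 4))
probs = boltzmann_policy(Q[0])           # uniform at the start
a = np.random.choice(len(probs), p=probs)
Q = q_learning_step(Q, s=0, a=a, r=-1.0, s_next=1, arousal=0.3, pleasure=0.6)
```

Under this reading, the affect signal simply biases the value estimates toward motivationally salient transitions, while action selection itself remains a standard Boltzmann policy.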

Highlights

  • The concept of reinforcement was introduced into artificial intelligence (AI) in the 1950s and, as a bio-inspired machine learning method, reinforcement learning (RL) has developed rapidly since the 1980s [1].

  • We propose adopting affect factors into conventional RL to improve its learning performance in multi-agent systems (MASs).

Summary

Introduction

“Reinforcement” was first used by Pavlov in his famous conditioned-reflex theory in the 1920s. Affect factors are commonly abstracted into a two-dimensional space with a “Pleasant-Unpleasant” (valence) axis and a “High Activation-Low Activation” (arousal) axis, as described by Larson and Diener [27]. Using these emotional factors, Ide and Nozawa introduced a series of rules to drive robots to pull or push each other so as to avoid obstacles and cooperatively find a goal in an unknown environment [25,26]. To overcome problems such as dead-lock and multiple-goal exploration in more complex environments, Kuremoto et al. improved the emotion-to-behavior rules by adding another psychological factor, “curiosity”, in [29,30].

Russell’s Affect Model
Emotion Function
A Motivation Function
Policy to Select an Action
Learning Algorithm
Definition of Pursuit Problem
Results of Simulation with a Static Prey
Results of Simulation with a Dynamic Prey
Discussions
Conclusions and Future Works
