Abstract

The creation of believable behaviors for Non-Player Characters (NPCs) is key to improving the player’s experience of a game. To achieve this objective, we need to design NPCs that appear to be controlled by a human player. In this paper, we propose a hierarchical reinforcement learning framework for believable bots (HRLB^2). This novel approach is designed to overcome two main challenges currently faced in the creation of human-like NPCs. The first is exploring domains with high-dimensional state–action spaces while satisfying constraints imposed by the traits that characterize human-like behavior. The second is generating behavioral diversity while also adapting to the opponent’s playing style. We evaluated the effectiveness of our framework in the domain of the 2D fighting game Street Fighter IV. The results of our tests demonstrate that our bot behaves in a human-like manner.
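
To make the idea of a hierarchical controller concrete, the Python sketch below shows one common way such a bot can be structured: a high-level policy selects an abstract strategy, and a low-level policy selects the concrete action that carries that strategy out. This is an illustrative assumption about the general shape of hierarchical reinforcement learning agents, not the paper’s actual implementation; the names HighLevelPolicy, LowLevelPolicy, and act are hypothetical.

# Illustrative sketch only: the class and method names are hypothetical and the
# decision rules are random placeholders standing in for learned policies.
import random


class HighLevelPolicy:
    """Chooses an abstract strategy (e.g. a fighting-game tactic) from the current state."""

    def __init__(self, strategies):
        self.strategies = strategies

    def select_strategy(self, state):
        # Placeholder decision rule; a learned high-level policy would go here.
        return random.choice(self.strategies)


class LowLevelPolicy:
    """Chooses a primitive action that realizes the chosen strategy."""

    def __init__(self, actions_per_strategy):
        self.actions_per_strategy = actions_per_strategy

    def select_action(self, state, strategy):
        # Placeholder decision rule; a learned sub-policy would go here.
        return random.choice(self.actions_per_strategy[strategy])


def act(state, high, low):
    """One decision step of the two-level hierarchical controller."""
    strategy = high.select_strategy(state)
    return low.select_action(state, strategy)

In a full agent, learned policies at each level would replace the random placeholders shown here.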

Highlights

  • In recent years, the Game AI community has made many efforts to better understand how constructs from the Theory of Flow can improve current approaches in the player-centered subarea [1]

  • The first challenge is exploring domains with high-dimensional state–action spaces while satisfying constraints imposed by the traits that characterize human-like behavior

  • We evaluated the effectiveness of HRLB^2 in generating believable behaviors for Non-Player Characters (NPCs) in the domain of the 2D fighting game Street Fighter IV


Summary

Introduction

The Game AI community has made many efforts to better understand how constructs from the Theory of Flow can improve current approaches in the player-centered subarea [1]. We can define flow as a lasting and deep state of immersion [1]. To approach the design of believable NPCs, we need to identify which traits characterize human-like behavior [3,4], and how those traits can be achieved through artificial intelligence (AI) techniques [1,5,6,7,8,9,10]. We model the bot’s decision-making problem as a Markov Decision Process (MDP), where S is the set of states and A is the set of actions. R : S × A → ℝ is the reward function, with R(s, a) denoting the immediate numeric reward obtained when the agent performs action a in state s. A policy π for an MDP is a function π : S → A that specifies the action a to be performed at each state s.
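
As a concrete illustration of these definitions, the following Python sketch shows standard tabular Q-learning, in which an action-value estimate is learned from the rewards R(s, a) and a greedy policy π : S → A is then derived from it. This is a generic example, not the learning algorithm used by HRLB^2; the environment interface (env.reset, env.step, env.actions) and the hyperparameters are assumptions made for illustration.

# Generic tabular Q-learning sketch, for illustration only. The environment
# interface and hyperparameters are assumptions, not part of the paper.
from collections import defaultdict
import random


def q_learning(env, episodes=1000, alpha=0.1, gamma=0.99, epsilon=0.1):
    q = defaultdict(float)  # q[(s, a)] estimates the long-term value of action a in state s

    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy exploration over the action set A.
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: q[(s, act)])

            s_next, r, done = env.step(a)  # r corresponds to R(s, a)

            # Standard Q-learning update toward the bootstrapped target.
            best_next = 0.0 if done else max(q[(s_next, act)] for act in env.actions)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s_next

    # Derive a deterministic policy pi : S -> A by acting greedily on Q.
    def pi(state):
        return max(env.actions, key=lambda act: q[(state, act)])

    return pi

Given any environment object exposing that interface, pi = q_learning(env) returns a deterministic policy that can be queried as pi(state).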

