Abstract
This article focuses on using deep reinforcement learning, specifically Proximal Policy Optimization (PPO), to train agents in a social dilemma game, a modified Dictator Game, in order to investigate the effect of selfishness and altruism on the believability of game agents. We present the design and implementation of the training environment, including reward functions based on the findings of established empirical research, with three agent profiles mapped to the three standard constant elasticity of substitution (CES) utility functions, i.e., selfish, perfect substitutes, and Leontief, which capture different levels of selfishness/altruism. The trained models are validated and then used in a sample game, which is used to evaluate the believability of the three agent profiles using agent believability metrics. The results indicate that players find altruistic behaviour more believable and consider selfish behaviour less so. Analysis of the results indicates that human-like behaviour produced by artificial intelligence derives from perceived human behaviour rather than observed human behaviour. The analysis also indicates that selfishness/altruism may be considered an extra dimension to be included in the believability metrics.
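To make the reward design concrete, the sketch below shows one plausible way to encode the three CES utility profiles as reward functions. The general CES form and its selfish, perfect-substitutes, and Leontief special cases follow the empirical social-preference literature the abstract refers to; the specific alpha and rho values here are illustrative assumptions, not the paper's published parameters.

```python
# Minimal sketch of reward functions mapped to the three CES utility profiles.
# CES utility: u(pi_s, pi_o) = (alpha * pi_s**rho + (1 - alpha) * pi_o**rho)**(1/rho),
# where pi_s is the agent's own payoff and pi_o is the other player's payoff.

def ces_utility(pi_self: float, pi_other: float, alpha: float, rho: float) -> float:
    """General CES utility over own and other's payoff (rho != 0, payoffs >= 0)."""
    return (alpha * pi_self**rho + (1.0 - alpha) * pi_other**rho) ** (1.0 / rho)

def selfish_reward(pi_self: float, pi_other: float) -> float:
    """Selfish profile: all weight on the agent's own payoff (alpha -> 1)."""
    return pi_self

def perfect_substitutes_reward(pi_self: float, pi_other: float) -> float:
    """Perfect substitutes: rho = 1, own and other's payoffs trade off linearly."""
    return ces_utility(pi_self, pi_other, alpha=0.5, rho=1.0)

def leontief_reward(pi_self: float, pi_other: float) -> float:
    """Leontief limit (rho -> -infinity): utility is the minimum of the two payoffs."""
    return min(pi_self, pi_other)
```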
Highlights
Most video games would benefit from the application of Artificial Intelligence (AI), from personalisation and management to supporting progression through a game, or through serving as opponents implemented as individual agents, groups of agents, or a central intelligence [1]
This paper presents an approach to creating agents that exhibit different levels of selfishness and altruism in their behaviour
The agents are trained using deep reinforcement learning with Proximal Policy Optimization (PPO)
Summary
Most video games would benefit from the application of Artificial Intelligence (AI), from personalisation and management to supporting progression through a game, or through serving as opponents implemented as individual agents, groups of agents, or a central intelligence [1]. Producing human-like decision-making behaviour would allow for a more enjoyable gameplay experience. This would be achievable by using models generated through reinforcement learning (RL), especially when the reward function is mapped to utility functions based on human observation. Games such as the Prisoner's Dilemma and the Dictator Game have been used to guide research in social dilemmas. This research investigates the use of RL to create believable game agents by training them to exhibit human-like altruistic behaviour. The paper presents the design and implementation of the RL approach used to create altruistic agents, describing the setting, reward functions, and hyperparameters used during training. It concludes with a discussion of the implications of the results and proposes future research directions
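As an illustration of how such a reward function could drive PPO training, the sketch below wires a Leontief-style reward into a one-shot dictator-game environment using gymnasium and Stable-Baselines3. This is a minimal sketch under assumed choices: the environment, the endowment of 10, the reward function, and the training budget are illustrative, not the authors' published setup.

```python
# Minimal sketch: training a dictator-game agent with PPO (Stable-Baselines3).
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO

class DictatorGameEnv(gym.Env):
    """One-shot dictator game: the agent splits a fixed endowment with a recipient."""

    def __init__(self, endowment: float = 10.0,
                 reward_fn=lambda keep, give: min(keep, give)):  # Leontief profile
        super().__init__()
        self.endowment = endowment
        self.reward_fn = reward_fn  # e.g., one of the CES-based rewards sketched above
        # Action: fraction of the endowment given to the other player.
        self.action_space = spaces.Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32)
        # Observation: the endowment available this round.
        self.observation_space = spaces.Box(low=0.0, high=np.inf, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        return np.array([self.endowment], dtype=np.float32), {}

    def step(self, action):
        give = float(action[0]) * self.endowment
        keep = self.endowment - give
        reward = self.reward_fn(keep, give)  # utility of (own payoff, other's payoff)
        # One decision per episode, so the episode terminates immediately.
        return np.array([self.endowment], dtype=np.float32), reward, True, False, {}

model = PPO("MlpPolicy", DictatorGameEnv(), verbose=0)
model.learn(total_timesteps=50_000)
```

Under a Leontief reward, the policy should converge toward an even split, since utility is capped by the smaller of the two payoffs; swapping in the selfish reward should instead drive the give fraction toward zero.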