A Decentralized Partially Observable Markov Decision Model with Action Duration for Goal Recognition in Real Time Strategy Games

Peng Jiao, Lin Sun, Shiguang Yue, Xiangyu Wei, Kai Xu

doi:10.1155/2017/4580206

Open Access

https://doi.org/10.1155/2017/4580206

Abstract

Multiagent goal recognition is a difficult yet important problem in many real time strategy games and simulation systems. Traditional modeling methods either require detailed domain knowledge about the agents and a training dataset for policy estimation, or lack a clear definition of action duration. To address these problems, we propose a novel Dec-POMDM-T model that combines the classic Dec-POMDP, an observation model for the recognizer, a joint goal with its termination indicator, and time duration variables for actions together with action termination variables. The paper uses a model-free algorithm named cooperative colearning based on Sarsa. Because Dec-POMDM-T usually faces multiagent goal recognition problems with various kinds of noise, partially missing data, and unknown action durations, the paper uses the SIS PF (sequential importance sampling particle filter) with resampling for inference under the dynamic Bayesian network structure of Dec-POMDM-T. In the experiments, a modified predator-prey scenario is adopted to study the multiagent joint goal recognition problem, that is, the recognition of the joint target shared among cooperative predators. Experimental results show that (a) Dec-POMDM-T works effectively in multiagent goal recognition and adapts well to dynamically changing goals within the agent group; (b) Dec-POMDM-T outperforms traditional Dec-MDP-based methods in terms of precision, recall, and F-measure.
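
To make the inference step named in the abstract more concrete, the following is a minimal sketch of a sequential importance sampling particle filter with resampling, applied to a toy goal recognition problem. It is not the Dec-POMDM-T dynamic Bayesian network from the paper: the two goals, the 0.05 goal-switch probability, the 0.8 observation accuracy, and all function names (`sis_pf_step`, `goal_posterior`, `run_demo`) are assumptions made for illustration only. A missing observation is represented here by `None` and simply leaves the weights unchanged.

```python
import random

# Hypothetical toy goals; the paper's joint goals are richer than this.
GOALS = ("prey_A", "prey_B")


def transition(goal, rng):
    """Assumed goal dynamics: the goal switches with probability 0.05."""
    if rng.random() < 0.05:
        return GOALS[1 - GOALS.index(goal)]
    return goal


def likelihood(obs, goal):
    """Assumed noisy observation model; None stands for a missing observation."""
    if obs is None:
        return 1.0  # a missing observation carries no information
    return 0.8 if obs == goal else 0.2


def sis_pf_step(particles, weights, obs, rng, ess_ratio=0.5):
    """One SIS update, with resampling when the effective sample size is low."""
    n = len(particles)
    # Propagate every particle through the transition model.
    particles = [transition(p, rng) for p in particles]
    # Reweight each particle by the likelihood of the new observation.
    weights = [w * likelihood(obs, p) for w, p in zip(weights, particles)]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / n] * n  # degenerate case: fall back to uniform weights
    else:
        weights = [w / total for w in weights]
    # Resample only when the effective sample size drops below the threshold.
    ess = 1.0 / sum(w * w for w in weights)
    if ess < ess_ratio * n:
        particles = rng.choices(particles, weights=weights, k=n)
        weights = [1.0 / n] * n
    return particles, weights


def goal_posterior(particles, weights):
    """Sum the particle weights per goal to approximate the goal posterior."""
    posterior = {g: 0.0 for g in GOALS}
    for p, w in zip(particles, weights):
        posterior[p] += w
    return posterior


def run_demo(observations, n_particles=500, seed=0):
    """Filter a sequence of observations and return the final goal posterior."""
    rng = random.Random(seed)
    particles = [rng.choice(GOALS) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for obs in observations:
        particles, weights = sis_pf_step(particles, weights, obs, rng)
    return goal_posterior(particles, weights)


if __name__ == "__main__":
    # Nine noisy readings that point at prey_A, with one missing step.
    print(run_demo(["prey_A"] * 5 + [None] + ["prey_A"] * 4))
```

Resampling only when the effective sample size falls below a threshold is one common design choice; resampling at every step is another, and the sketch does not claim to match the schedule used in the paper.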

Highlights

More and more commercial real time strategy (RTS) games have received attention from AI researchers, behavior scientists, policy evaluators, and staff training groups [1]
Based on ideas from Dec-POMDM and SMDPs [10], we propose a novel decentralized partially observable Markov decision model with time duration (Dec-POMDM-T) to formalize multiagent cooperative behaviors with durative actions
We propose the Dec-POMDM-T for formalizing the world states, behaviors, goals, and action durations in goal recognition problems

Summary

Introduction

More and more commercial real time strategy (RTS) games have received attention from AI researchers, behavior scientists, policy evaluators, and staff training groups [1]. Though many AI planning and decision-making algorithms have been applied to agents in RTS games, their behavior patterns are still easy to predict, which makes games less entertaining or intuitive. This is partly because of the agents’ limited ability to process and understand information, for example, to recognize the goals or intentions of opponents or friends. If AI players can recognize the real moving or attacking target, they will be better prepared, whether through early defense deployment or counter decision-making. Given these benefits, goal recognition has attracted considerable attention from researchers in many different fields. Many related models and algorithms have been proposed and applied, such as hidden Markov models (HMMs) [4], conditional random fields (CRFs) [5], Markov decision processes (MDPs) [6], and particle filtering (PF) [7].

Objectives

Methods

Results

Conclusion