Abstract

The payoff of an agent depends on both the environment and the actions of other agents. Thus, the ability to model and predict the strategies and behaviors of other agents in an interactive decision-making scenario is a core functionality of intelligent systems. State-of-the-art methods for opponent modeling mainly build an explicit model of opponents' actions, preferences, targets, etc., which the primary agent then uses to make decisions. However, it is more important for an agent to increase its payoff than to accurately predict opponents' behavior. We therefore propose a framework that synchronizes the primary agent's opponent modeling and decision making by incorporating opponent modeling into reinforcement learning. In interactive decision making, the payoff depends not only on the behavioral characteristics of the opponent but also on the current state. Conflating the two obscures the respective effects of state and action, so neither can be accurately encoded. To this end, our model separates state evaluation from action evaluation. Experimental results in two game environments, a simulated soccer game and a real game called quiz bowl, show that introducing opponent modeling effectively improves decision payoffs, and that the proposed framework outperforms benchmark opponent-modeling methods.
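The abstract names two architectural ideas: conditioning the learner on an opponent model, and separating state evaluation from action evaluation. The paper's implementation details are not given here, so the following is only a minimal sketch of one plausible reading, assuming a PyTorch dueling-style Q-network in which an opponent-feature encoder feeds only the action-advantage stream. All names (OpponentAwareDuelingQNet, opp_dim, etc.) are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class OpponentAwareDuelingQNet(nn.Module):
    """Sketch (not the paper's architecture): a Q-network that conditions
    on opponent features and evaluates the state separately from actions."""

    def __init__(self, state_dim, opp_dim, num_actions, hidden=128):
        super().__init__()
        # Encode the environment state and the opponent observation
        # (e.g., the opponent's recent action history) separately.
        self.state_enc = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.opp_enc = nn.Sequential(nn.Linear(opp_dim, hidden), nn.ReLU())
        # State-value stream: depends only on the state encoding.
        self.value = nn.Linear(hidden, 1)
        # Action-advantage stream: depends on state and opponent features,
        # so the opponent model influences action evaluation only.
        self.advantage = nn.Linear(2 * hidden, num_actions)

    def forward(self, state, opp_features):
        s = self.state_enc(state)
        o = self.opp_enc(opp_features)
        v = self.value(s)                              # V(s)
        a = self.advantage(torch.cat([s, o], dim=-1))  # A(s, a | opponent)
        # Combine the streams with the usual mean-centering so that
        # V and A are identifiable.
        return v + a - a.mean(dim=-1, keepdim=True)
```

In this reading, the value stream V(s) captures the worth of the current state independently of the opponent, while the advantage stream ranks actions against the modeled opponent, which matches the abstract's stated separation of state evaluation from action evaluation.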
