Abstract

Modeling other agents (MOA) is an approach in which each agent constructs a model of the other agents. It enables agents to predict the actions of others and to achieve coordinated, effective interactions in multi-agent systems. However, the relationship between the actions that agents execute and the actions predicted for them is vague and varied. To clarify this relationship, we proposed a method in which an agent constructs its MOA through communication, using the historical data of other agents, and treats itself and its MOA asymmetrically in a non-cooperative game to obtain a Stackelberg equilibrium (SE). The SE is then used to choose actions. We experimentally demonstrated that, in a partially observable, mixed cooperative-competitive environment, agents using our method with reinforcement learning could establish better coordination and behave more appropriately than agents using conventional methods. We then analyzed the coordinated interaction structure that emerged in the trained network to clarify the relationships between individual agents.
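
As a rough illustration of the asymmetric treatment described above, the following sketch computes a pure-strategy Stackelberg equilibrium for a small two-player matrix game. It is not the method of the paper: the function name `stackelberg_pure`, the payoff matrices, the choice of which player leads, and the tie-breaking rule (ties in the follower's best response are resolved in the leader's favor) are all assumptions made for this example.

```python
import numpy as np

def stackelberg_pure(leader_payoff, follower_payoff):
    """Pure-strategy Stackelberg equilibrium of a two-player matrix game.

    Rows index the leader's actions, columns the follower's actions.
    Hypothetical helper for illustration only.
    """
    leader_payoff = np.asarray(leader_payoff, dtype=float)
    follower_payoff = np.asarray(follower_payoff, dtype=float)
    best = None
    for a in range(leader_payoff.shape[0]):
        # Follower observes the leader's action a and best-responds to it.
        row = follower_payoff[a]
        responses = np.flatnonzero(row == row.max())
        # Assumed tie-break: among equal best responses, favor the leader.
        b = int(responses[np.argmax(leader_payoff[a, responses])])
        value = leader_payoff[a, b]
        # Leader commits to the action whose anticipated outcome is best.
        if best is None or value > best[0]:
            best = (value, a, b)
    return best[1], best[2]

# Assumed example payoffs (not taken from the paper).
L = [[2, 4],
     [1, 3]]
F = [[1, 0],
     [0, 2]]
leader_action, follower_action = stackelberg_pure(L, F)
print(leader_action, follower_action)  # 1 1
```

In this example the leader's action 0 strictly dominates action 1 under simultaneous play, which would yield the outcome (0, 0) with a leader payoff of 2. Committing first to action 1 instead induces the follower to respond with action 1, raising the leader's payoff to 3, which is the kind of asymmetry a Stackelberg formulation captures.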
