Abstract

One of the main problems in cooperative multiagent learning is that the joint action space grows exponentially with the number of agents. In this paper, we investigate a sparse representation of the coordination dependencies between agents, employing roles and context-specific coordination graphs to reduce the joint action space. In our framework, the global joint Q-function is decomposed into a number of local Q-functions. Each local Q-function is shared among a small group of agents and is composed of a set of value rules. We propose a novel multiagent Q-learning algorithm that learns the weights of these value rules automatically. We give empirical evidence that our learning algorithm converges to the same optimal policy as traditional multiagent learning techniques, but significantly faster.
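To make the rule-based decomposition concrete, the sketch below gives the generic form such a scheme usually takes: the global Q-value is the sum of the weights of the value rules whose context holds, and a Q-learning-style update spreads the temporal-difference error over the rules that contributed. This is a minimal illustration under assumed notation ($\rho_j$, $c_j$, $w_j$, $n(s,a)$, $\alpha$, $\gamma$ are ours, not the paper's), and the paper's actual update may distribute the error per agent rather than per rule.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

% Rule-based decomposition: the global Q-value is the sum of the weights
% w_j of all value rules rho_j whose context c_j (a partial state--action
% assignment over a small group of agents) is consistent with (s, a).
% (Assumed generic form, not copied from the paper.)
\begin{equation}
  Q(s,a) \;=\; \sum_{j \,:\, c_j \subseteq (s,a)} w_j
\end{equation}

% Q-learning-style weight update: the temporal-difference error is
% divided evenly over the n(s,a) rules that contributed to Q(s,a),
% with learning rate alpha and discount factor gamma.
\begin{equation}
  w_j \;\leftarrow\; w_j
    + \frac{\alpha}{n(s,a)}
      \Bigl[\, R(s,a) + \gamma \max_{a'} Q(s',a') - Q(s,a) \Bigr]
\end{equation}

\end{document}
```

Because only the rules whose context matches the current situation contribute to the sum, the maximisation over joint actions can be performed efficiently on the coordination graph (e.g., by variable elimination) instead of by enumerating the exponentially large joint action space.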
