Abstract

The study of social dilemmas is no longer limited to unrealistic stateless matrix games; multi-agent reinforcement learning has extended it to temporally and spatially extended Markov games. Many multi-agent reinforcement-learning algorithms have been proposed to solve such sequential social dilemmas. However, most existing algorithms focus on cooperation to improve the overall reward while ignoring equality among agents, which limits their practicality. Here, we propose a novel admission-based hierarchical multi-agent reinforcement-learning algorithm that promotes both cooperation and equality among agents. We extend the give-or-take-some model to Markov games, decompose the fairness of each agent, and propose an Admission reward. For better learning, we design a hierarchy consisting of a high-level policy and multiple low-level policies, where the high-level policy maximizes the Admission reward by choosing among the low-level policies to interact with the environment. In addition, policies are learned and executed in a fully decentralized manner. Experiments in multiple sequential social dilemma environments show that the Admission algorithm significantly outperforms the baselines, demonstrating that it learns both cooperation and equality.
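
The following is a minimal sketch of the hierarchy the abstract describes: a high-level policy that periodically selects one of several low-level policies and is trained on the Admission (fairness-shaped) reward rather than the raw environment reward. The epsilon-greedy selector, the fixed selection interval, and the class and parameter names are illustrative assumptions, not the paper's method.

```python
import random

class HierarchicalAgent:
    """Hypothetical per-agent hierarchy: a high-level selector over
    callable low-level policies, updated from the Admission reward."""

    def __init__(self, low_level_policies, selection_interval=10, epsilon=0.1):
        self.low_level_policies = low_level_policies  # e.g., sub-policies with different behaviors
        self.selection_interval = selection_interval  # steps between high-level decisions
        self.epsilon = epsilon                        # exploration rate of the selector
        self.q_values = [0.0] * len(low_level_policies)  # high-level value estimates
        self.counts = [0] * len(low_level_policies)
        self.current = 0
        self.steps = 0

    def act(self, observation):
        # The high-level policy re-selects a low-level policy every interval.
        if self.steps % self.selection_interval == 0:
            if random.random() < self.epsilon:
                self.current = random.randrange(len(self.low_level_policies))
            else:
                self.current = max(range(len(self.q_values)),
                                   key=self.q_values.__getitem__)
        self.steps += 1
        # The chosen low-level policy produces the environment action.
        return self.low_level_policies[self.current](observation)

    def update(self, admission_reward):
        # The selector is trained on the fairness-shaped Admission reward,
        # here via an incremental mean as a stand-in for the paper's learner.
        i = self.current
        self.counts[i] += 1
        self.q_values[i] += (admission_reward - self.q_values[i]) / self.counts[i]
```

Because each agent holds its own selector and sub-policies, this structure is consistent with the decentralized learning and execution the abstract mentions: no component requires access to other agents' policies.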
