Abstract

In this paper, we propose a novel Intrinsic Reward method with Peer Incentives (IRPI) that promotes direct inter-agent interactions and implicitly addresses the credit assignment problem in cooperative multi-agent reinforcement learning (MARL). IRPI builds mutual incentives between agents from the causal effect they have on one another, enabling more advanced cooperation. Specifically, we construct a new intrinsic reward mechanism that equips each agent with the ability to reward other agents according to the causal effect between them. The mechanism is implemented as a neural network and learned from the causal effect between the agents. Counterfactual reasoning is used to infer this causal effect from the joint action-state value function, and the quality of the effect is then assessed using the individual state value function. Simulation results on StarCraft II micromanagement tasks demonstrate that the proposed IRPI enhances cooperation among the RL agents and achieves better performance than some state-of-the-art MARL methods in various cooperative multi-agent tasks.
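The counterfactual step described above can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes an advantage-style estimate in which the effect of agent j's action is the joint value of the actual joint action minus the average joint value over agent j's alternative actions, with the other agents' actions held fixed. The names `counterfactual_effect`, `toy_q`, and `action_space` are hypothetical, and the later step of assessing the effect with an individual state value function is omitted.

```python
def counterfactual_effect(q_joint, state, joint_action, j, action_space):
    """Assumed estimate: Q(s, a) minus the mean of Q(s, (a_-j, a_j')) over
    every alternative action a_j' available to agent j."""
    actual = q_joint(state, joint_action)
    counterfactual_values = []
    for alt in action_space:
        cf_action = list(joint_action)
        cf_action[j] = alt  # swap only agent j's action
        counterfactual_values.append(q_joint(state, tuple(cf_action)))
    baseline = sum(counterfactual_values) / len(counterfactual_values)
    return actual - baseline


def toy_q(state, joint_action):
    """Hypothetical joint value: 1 when both agents choose the same action."""
    return 1.0 if joint_action[0] == joint_action[1] else 0.0


# Agent 1 matching agent 0 has a positive effect; mismatching has a negative one.
effect_match = counterfactual_effect(toy_q, None, (1, 1), j=1, action_space=[0, 1])
effect_mismatch = counterfactual_effect(toy_q, None, (0, 1), j=1, action_space=[0, 1])
print(effect_match, effect_mismatch)
```

In a learned setting, `q_joint` would be a trained critic rather than a hand-written function, and the resulting effect could serve as the training signal for a peer-reward network.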

