Abstract

Distributed multiagent reinforcement learning in a shared environment is prohibitively hard because credit must be assigned to the individual actions of each agent, especially when the agent is a member of a team. Moreover, the team-level reward from the environment, such as winning, is sparse and delayed, which makes learning even more challenging. To address the credit assignment and sparse delayed reward problems that are common in multiagent reinforcement learning, researchers usually construct or learn an internal reward signal that acts as a proxy for winning and provides denser feedback to individual agents. To improve learning on a typical multiagent task, we designed three types of internal rewards for multiagent team members and evaluated their effects. The results show that not every internal reward improves multiagent reinforcement learning: when the task is not very complex and the team can finish it in a relatively short time, the sparse reward such as winning yields the best learning performance, and the other two forms of reward perform worse than this simple sparse reward. To some extent, our results can serve as a reference for reward function design in applications of distributed multiagent reinforcement learning.
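To make the contrast between reward signals concrete, the sketch below illustrates the general idea of a sparse team reward versus a denser per-agent internal (shaping) reward, and a combination of the two. This is a minimal illustrative example, not the paper's actual reward definitions; all function names, the distance-based shaping term, and the mixing weight `beta` are hypothetical assumptions.

```python
import numpy as np

def sparse_team_reward(episode_won: bool) -> float:
    """Sparse delayed reward: the whole team receives +1 only when it wins."""
    return 1.0 if episode_won else 0.0

def dense_internal_reward(agent_pos: np.ndarray, goal_pos: np.ndarray,
                          prev_dist: float):
    """Dense internal reward (hypothetical shaping): each agent is rewarded
    for progress toward a task-relevant goal, acting as a proxy for winning.
    Returns the shaping reward and the updated distance."""
    dist = float(np.linalg.norm(goal_pos - agent_pos))
    return prev_dist - dist, dist  # positive when the agent moves closer

def mixed_reward(episode_won: bool, shaping: float, beta: float = 0.1) -> float:
    """Combined signal: sparse team outcome plus a scaled per-agent shaping term."""
    return sparse_team_reward(episode_won) + beta * shaping
```

The abstract's finding suggests that, for short and relatively simple team tasks, training on `sparse_team_reward` alone can outperform the denser alternatives.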
