Abstract

We study multi-agent reinforcement learning in a setting where each agent observes only its local reward and can communicate only with its nearby neighbors. We develop a distributed algorithm based on the actor-critic method that enables all agents to cooperatively learn a control policy maximizing the global objective function. Simulations are provided to validate the proposed algorithm.
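
To illustrate the communication constraint, the following is a minimal sketch of neighbor-only averaging (consensus), a common building block in distributed learning schemes. It is not the algorithm proposed here: the ring topology, the equal averaging weights, the reward values, and all function names are assumptions chosen for illustration. It shows only how agents that each see a local reward can agree on the network-wide average reward by talking to immediate neighbors.

```python
def ring_neighbors(n):
    """Hypothetical topology: agent i talks only to agents i-1 and i+1 (n >= 3)."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def consensus_step(values, neighbors):
    """One communication round: each agent averages its value with its neighbors'.

    Equal weights over {self, neighbors} are doubly stochastic on a ring
    (every agent has the same degree), so the network-wide mean is preserved.
    """
    new_values = []
    for i, nbrs in enumerate(neighbors):
        local = [values[i]] + [values[j] for j in nbrs]
        new_values.append(sum(local) / len(local))
    return new_values


def estimate_global_reward(local_rewards, rounds=200):
    """Repeat neighbor averaging; each agent starts from its own local reward."""
    neighbors = ring_neighbors(len(local_rewards))
    estimates = list(local_rewards)
    for _ in range(rounds):
        estimates = consensus_step(estimates, neighbors)
    return estimates


# Illustrative local rewards for five agents (made-up values).
local_rewards = [1.0, 0.0, 4.0, 2.0, 3.0]
estimates = estimate_global_reward(local_rewards)
```

After enough rounds every entry of `estimates` approaches the mean of `local_rewards`, although no agent ever reads a non-neighbor's reward. In an actor-critic setting, a step of this kind could be interleaved with each agent's local critic update so that the agents track a shared global quantity.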
