Abstract

To cope with the curse of dimensionality, a ubiquitous problem in multi-agent reinforcement learning, this paper approaches multi-agent learning from a new perspective and proposes a new algorithm, the optimal tracking agent (OTA). The OTA treats the other agents as part of the system and uses an estimator to track the system's dynamics. It thereby obtains a dynamic model of limited accuracy and applies model-based reinforcement learning to react optimally to the system. Because every step is carried out from a single agent's perspective, the action search space is only that agent's own and no longer grows exponentially with the number of agents. This relieves the curse of dimensionality in the action space. Experimental results illustrate the validity and efficiency of the proposed method.
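
As a rough illustration of the idea summarized above, the following Python sketch shows one way a single agent could fold the other agents into the environment, track the resulting dynamics, and plan over its own actions only. The abstract does not specify the estimator or the planner, so everything here is an assumption made for illustration: the tabular model with exponential forgetting, the value-iteration planner, and all names (`TrackingAgent`, `track_rate`, `update_model`, `plan`, `act`) are hypothetical and are not the paper's method.

```python
import numpy as np


class TrackingAgent:
    """Hypothetical single-agent learner; other agents are part of the environment."""

    def __init__(self, n_states, n_actions, track_rate=0.1, gamma=0.9):
        self.n_states = n_states
        self.n_actions = n_actions      # size of this agent's OWN action set only
        self.track_rate = track_rate    # assumed forgetting factor of the estimator
        self.gamma = gamma              # discount factor
        # Start from a uniform transition model and zero expected reward.
        self.P = np.full((n_states, n_actions, n_states), 1.0 / n_states)
        self.R = np.zeros((n_states, n_actions))

    def update_model(self, s, a, r, s_next):
        """Track the (possibly drifting) dynamics with an exponential moving average."""
        target = np.zeros(self.n_states)
        target[s_next] = 1.0
        # Each row P[s, a] stays a probability distribution under this update.
        self.P[s, a] += self.track_rate * (target - self.P[s, a])
        self.R[s, a] += self.track_rate * (r - self.R[s, a])

    def plan(self, n_sweeps=100):
        """Value iteration on the tracked model; returns Q with shape (states, actions)."""
        V = np.zeros(self.n_states)
        for _ in range(n_sweeps):
            Q = self.R + self.gamma * (self.P @ V)
            V = Q.max(axis=1)
        return self.R + self.gamma * (self.P @ V)

    def act(self, s, rng, epsilon=0.1):
        """Epsilon-greedy choice over this agent's own actions."""
        if rng.random() < epsilon:
            return int(rng.integers(self.n_actions))
        return int(np.argmax(self.plan()[s]))
```

In this sketch the model tables have size proportional to the agent's own action count, not to the joint action space of all agents, which is the point the abstract makes about the search space. In use, the agent would call `update_model` after each observed transition and `act` to choose its next action; the effect of the other agents shows up only through the transitions and rewards it observes.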
