Abstract

For tasks modeled as Markov Decision Processes, this paper presents a novel multiagent Reinforcement Learning (RL) method with a perception and conversion action mechanism: during learning, the learning agents observe an adversary agent and, by observing the state variation that the adversary causes in the task environment, convert the adversarial action into a corresponding action of their own. The paper also examines inexpensive ways for learning agents to communicate, using both direct communication and indirect media communication to realize cooperation among agents. Direct communication is realized by sharing sensation; indirect media communication is realized by updating reinforcement values on the common environment observation. On this basis, a multiagent RL algorithm, the Q-ac multiagent RL method, is proposed. Through perception and conversion of actions, the learning agents extend their learning episodes and derive more observations from fewer actions. Direct communication enhances the agents' ability to observe the environment, and indirect media communication improves their ability to derive the optimal action policy. Simulation results on the hunter game demonstrate the efficiency of the proposed method.
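
The two communication channels described above can be illustrated with a minimal sketch. It is not the Q-ac algorithm itself, whose details the abstract does not give. It assumes a standard tabular Q-learning update as a stand-in for "updating reinforcement values", and the names `SharedQTable` and `merge_sensations`, the learning rate `alpha`, and the discount `gamma` are all hypothetical.

```python
from collections import defaultdict


class SharedQTable:
    """One table of reinforcement values that every learning agent reads and writes.

    Assumption: plain tabular Q-learning stands in for the paper's update rule.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (hypothetical value)
        self.gamma = gamma  # discount factor (hypothetical value)
        self.q = defaultdict(float)  # (state, action) -> value, default 0.0

    def best_value(self, state):
        return max(self.q[(state, a)] for a in self.actions)

    def update(self, state, action, reward, next_state):
        # Indirect media communication: any agent's experience changes the
        # values that all other agents later consult.
        target = reward + self.gamma * self.best_value(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def greedy_action(self, state):
        return max(self.actions, key=lambda a: self.q[(state, a)])


def merge_sensations(*sensations):
    """Direct communication: combine agents' partial observations into one state key."""
    merged = {}
    for sensation in sensations:
        merged.update(sensation)
    return tuple(sorted(merged.items()))
```

In this sketch, an update written by one hunter is immediately visible to another hunter that calls `greedy_action` on the same table, and `merge_sensations` gives both hunters a state built from what either of them can see.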
