Abstract

In this paper, the mechanism of the human–computer game is investigated using multi-agent systems (MASs) and reinforcement learning (RL). The game is formulated as a bipartite consensus problem, in which the interactions among humans and computers are modelled as a multi-agent system over a coopetition network. The coopetition network associated with the multi-agent system is represented by a signed graph, where positive/negative edges denote cooperative/competitive interactions. The decision mechanisms of the agents are assumed to be model-free, and each agent makes a distributed decision by learning from its own input-output data and that of its neighbours. The individual decision is constructed from the neighbours' state information and a performance index function. A policy iteration (PI) algorithm is proposed to solve the Hamilton-Jacobi-Bellman equation and obtain the optimal decision strategy. Furthermore, an actor-critic neural network is adopted to approximate the performance index and the optimal decision strategy in an online manner. Simulation results are finally given to validate the proposed reinforcement learning approach.
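To make the bipartite consensus formulation concrete, the sketch below simulates a small coopetition network represented by a signed graph. It uses the standard (model-based) Altafini-type signed-graph consensus protocol rather than the paper's model-free PI/actor-critic method, and the four-agent network, weights, and initial states are illustrative assumptions: two cooperating agents compete with the other two, so the graph is structurally balanced and the states converge to two opposite values.

```python
import numpy as np

# Hypothetical 4-agent coopetition network as a signed adjacency matrix:
# positive entries denote cooperation, negative entries competition.
# Agents {0, 1} cooperate with each other and compete with {2, 3}.
A = np.array([
    [ 0.0,  1.0, -1.0,  0.0],
    [ 1.0,  0.0,  0.0, -1.0],
    [-1.0,  0.0,  0.0,  1.0],
    [ 0.0, -1.0,  1.0,  0.0],
])

x = np.array([2.0, -1.0, 0.5, 3.0])  # arbitrary initial states
dt = 0.05                            # Euler integration step

for _ in range(2000):
    u = np.zeros_like(x)
    for i in range(len(x)):
        for j in range(len(x)):
            if A[i, j] != 0.0:
                # Signed-graph consensus protocol: each agent tracks
                # x_j for cooperators and -x_j for competitors.
                u[i] -= abs(A[i, j]) * (x[i] - np.sign(A[i, j]) * x[j])
    x = x + dt * u

print(np.round(x, 3))  # the two camps settle at opposite values, here ±0.625
```

Because the network is structurally balanced, a gauge transformation turns the signed dynamics into ordinary average consensus, so the final agreement value is the signed average of the initial states; in the paper's setting each agent would instead learn this behaviour from input-output data without knowing `A`.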
