Abstract

This paper proposes a fuzzy reinforcement learning technique that enables a group of pursuers in pursuit-evasion (PE) differential games to learn, in a decentralized manner, how to capture a single superior evader. The evader is superior in terms of its maximum speed, which exceeds the maximum speed of the fastest pursuer in the game. The proposed technique combines a fuzzy actor-critic learning automaton (FACLA) algorithm with the Apollonius circle technique and a specific formation control strategy, which together define the reward function for each pursuer. This reward enables each pursuer to update its value function accurately and, accordingly, to take the right actions by tuning the parameters of its fuzzy logic controller (FLC). The formation control strategy also keeps the distribution angles of the pursuers around the evader as invariant as possible during the capturing process, and prevents collisions among the pursuers. The superior evader is assumed to be intelligent: its strategy is to search continuously for a gap during the evasion process by using the Apollonius circle method. If there is a gap, the evader selects a path through it to escape; otherwise, the evader changes its direction to increase the capture time. Simulation results are given to validate the proposed learning algorithm.
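
The Apollonius circle geometry mentioned above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names are hypothetical, all pursuers are assumed to share one maximum speed, and every agent is assumed to move in a straight line at constant speed. Under those assumptions, the circle for one pursuer is the set of points that the pursuer and the evader reach at the same time, and an evader heading is blocked when it lies within asin(v_p / v_e) of the line of sight to a pursuer.

```python
import math

def apollonius_circle(pursuer, evader, v_p, v_e):
    """Center and radius of the locus |X - P| = lam * |X - E|, lam = v_p / v_e."""
    lam = v_p / v_e
    if not 0.0 < lam < 1.0:
        raise ValueError("expects a superior evader: 0 < v_p < v_e")
    (px, py), (ex, ey) = pursuer, evader
    k = 1.0 - lam ** 2
    center = ((px - lam ** 2 * ex) / k, (py - lam ** 2 * ey) / k)
    radius = lam * math.hypot(px - ex, py - ey) / k
    return center, radius

def blocked_half_angle(v_p, v_e):
    """Half-angle of the cone of evader headings that intersect the circle."""
    return math.asin(v_p / v_e)

def has_gap(pursuers, evader, v_p, v_e):
    """True if some evader heading lies outside every pursuer's blocked cone.

    Assumption: all pursuers share the speed v_p, so checking angularly
    adjacent pursuers is sufficient.
    """
    ex, ey = evader
    half = blocked_half_angle(v_p, v_e)
    bearings = sorted(math.atan2(py - ey, px - ex) for px, py in pursuers)
    n = len(bearings)
    for i in range(n):
        nxt = bearings[(i + 1) % n]
        if i == n - 1:
            nxt += 2.0 * math.pi  # wrap around from the last bearing to the first
        if nxt - bearings[i] > 2.0 * half:
            return True
    return False
```

For example, with the evader at the origin, a pursuer at (3, 0), and a speed ratio of 0.5, `apollonius_circle` gives a circle centered at (4, 0) with radius 2. A gap test of this kind is one simplified way to read the evader strategy described in the abstract; the reward design and the FACLA update themselves are not reproduced here.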

