Abstract

Area coverage control with multi-agent systems is formulated here in the framework of Bellman’s optimality. In the literature, optimal configurations are obtained with Lloyd’s algorithm, which iteratively drives the agents to centroidal Voronoi configurations corresponding to local maxima of a coverage metric, where sub-domains of the metric are assigned to each agent according to a Voronoi tessellation. In this work, optimal tracking is achieved by an adaptive control policy based on an actor–critic neural-network reinforcement learning technique, which rewards actions that decrease a Lyapunov function built on the area coverage metric, driving the error dynamics to zero, and which uses a feed-forward neural network to approximate the tracking control signal. The implementation of the area-coverage control is model-free in the sense that it relies on neural networks that interpolate local data gathered and shared by each agent, rather than on a global model of the system. We prove that optimality is achieved when the agents converge to the Voronoi centroids, thereby maximizing the coverage metric; an important implication is that the resulting class of solutions is consistent with those obtained with Lloyd’s algorithm and its extension to non-autonomous systems.
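The centroidal Voronoi iteration that the abstract attributes to Lloyd’s algorithm can be sketched as follows. This is a minimal pure-Python illustration, not the paper’s implementation: it assumes a uniform density sampled on a grid over the unit square, so the Voronoi cells are approximated discretely by nearest-agent assignment, and `lloyd_step`, the grid resolution, and the initial agent positions are all illustrative choices.

```python
import math

def lloyd_step(agents, points):
    """One Lloyd iteration: assign each sample point to its nearest
    agent (a discrete Voronoi tessellation), then move each agent to
    the centroid of the points in its cell."""
    cells = {i: [] for i in range(len(agents))}
    for p in points:
        nearest = min(range(len(agents)),
                      key=lambda i: math.dist(agents[i], p))
        cells[nearest].append(p)
    new_agents = []
    for i, agent in enumerate(agents):
        cell = cells[i]
        if cell:  # centroid of the assigned cell
            cx = sum(x for x, _ in cell) / len(cell)
            cy = sum(y for _, y in cell) / len(cell)
            new_agents.append((cx, cy))
        else:     # empty cell: agent stays put
            new_agents.append(agent)
    return new_agents

# Grid samples stand in for a uniform density over the unit square.
points = [(x / 20, y / 20) for x in range(21) for y in range(21)]
agents = [(0.1, 0.1), (0.15, 0.9), (0.8, 0.5)]
for _ in range(50):
    agents = lloyd_step(agents, points)
```

Because the point set is finite, the assignments can change only finitely many times, so the iteration reaches a fixed point: a centroidal Voronoi configuration, the class of locally optimal coverage solutions the paper compares against.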
