Abstract

This paper proposes a data-driven learning algorithm for human-in-the-loop cooperative tracking control of multi-agent systems with completely unknown dynamics. The human operator supervises the team of agents by sending a control command only to the leader vehicle; the leader is a non-autonomous agent whose control input, driven by that command, is nonzero but bounded. To reduce the communication frequency among vehicles, a distributed observer that uses only relative input–output information is designed to estimate the relative state information of the agents. Then, taking the transient response into account rather than only the steady state, a value iteration (VI) algorithm is proposed to solve the optimal output trajectory tracking control problem; it requires no initial stabilizing control policy, at the cost of a slower convergence rate. Building on the VI algorithm, a mixed iteration (MI) algorithm is developed that combines the advantages of VI and traditional policy iteration: it guarantees the convergence rate while still removing the requirement of an initial stabilizing control policy. Finally, a practical multi-robot network is used to verify the proposed algorithms and compare their advantages.
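The trade-off between VI and policy iteration described above can be illustrated on a much simpler problem than the one the paper treats. The sketch below is not the paper's data-driven algorithm: it assumes a known discrete-time linear model (A, B) with quadratic weights (Q, R), all chosen here purely for illustration, and it uses a hypothetical switching rule (run VI until the greedy gain is stabilizing, then switch to policy iteration) as a stand-in for the mixed-iteration idea. It only shows that VI can start from P = 0 without a stabilizing policy, and that handing over to policy iteration typically needs fewer iterations.

```python
import numpy as np

def greedy_gain(A, B, R, P):
    # Gain that minimizes the one-step cost-to-go for value matrix P.
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def riccati_step(A, B, Q, R, P):
    # One value-iteration update: P <- Q + A'PA - A'PB (R + B'PB)^-1 B'PA.
    K = greedy_gain(A, B, R, P)
    return Q + A.T @ P @ (A - B @ K)

def evaluate_policy(A, B, Q, R, K):
    # Policy evaluation: solve P = (A-BK)' P (A-BK) + Q + K'RK.
    Acl = A - B @ K
    n = A.shape[0]
    rhs = (Q + K.T @ R @ K).reshape(-1)
    vec_p = np.linalg.solve(np.eye(n * n) - np.kron(Acl.T, Acl.T), rhs)
    return vec_p.reshape(n, n)

def is_stabilizing(A, B, K):
    return np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1.0

def converged(P_new, P_old, tol):
    return np.linalg.norm(P_new - P_old) <= tol * max(1.0, np.linalg.norm(P_new))

def value_iteration(A, B, Q, R, tol=1e-9, max_iter=10000):
    P = np.zeros_like(Q)  # no initial stabilizing policy required
    for k in range(1, max_iter + 1):
        P_new = riccati_step(A, B, Q, R, P)
        if converged(P_new, P, tol):
            return P_new, k
        P = P_new
    return P, max_iter

def mixed_iteration(A, B, Q, R, tol=1e-9, max_iter=10000):
    # Assumed switching rule (illustrative only): VI until the greedy gain
    # stabilizes the closed loop, then policy iteration from that gain.
    P = np.zeros_like(Q)
    K = greedy_gain(A, B, R, P)
    iters = 0
    while iters < max_iter and not is_stabilizing(A, B, K):
        P = riccati_step(A, B, Q, R, P)
        K = greedy_gain(A, B, R, P)
        iters += 1
    while iters < max_iter:
        P_new = evaluate_policy(A, B, Q, R, K)
        K = greedy_gain(A, B, R, P_new)
        iters += 1
        if converged(P_new, P, tol):
            return P_new, iters
        P = P_new
    return P, iters

# Illustrative system (an assumption, not from the paper): a sampled double
# integrator, for which the zero gain is not stabilizing.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.eye(1)

P_vi, it_vi = value_iteration(A, B, Q, R)
P_mi, it_mi = mixed_iteration(A, B, Q, R)
print("VI iterations:", it_vi, "| mixed iterations:", it_mi)
```

Both routines should arrive at the same value matrix; the printed iteration counts give a rough sense of why combining the two schemes is attractive.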
