Abstract

Machine learning enables multi-robot systems to carry out desired tasks in unknown dynamic environments. In this paper, we extend the single-agent Q-learning algorithm to a multi-robot box-pushing system operating in an unknown dynamic environment with randomly distributed obstacles. Two kinds of extension are available: directly extending MDP-based (Markov decision process based) Q-learning to the multi-robot domain, and SG-based (stochastic game based) Q-learning. We select the first kind of extension because of its simplicity. The learning space, the box dynamics, and the reward function are presented in this paper. Furthermore, a simulation system is developed, and its results show the effectiveness, robustness, and adaptability of this learning-based multi-robot system. Statistical analysis of the results also shows that the robots learned a correct cooperative strategy even in a dynamic environment.
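The MDP-based extension mentioned above can be sketched as each robot running an independent tabular Q-learning update, treating the other robots as part of the environment. The sketch below is illustrative only: the parameter values, the toy state names, and the action set are assumptions, not the paper's actual learning space or reward function.

```python
import random
from collections import defaultdict

# Assumed hyperparameters for illustration, not the paper's values.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2
# Hypothetical discrete pushing actions for a box-pushing robot.
ACTIONS = ["push_north", "push_south", "push_east", "push_west"]

def make_q_table():
    # One table per robot; unseen (state, action) pairs default to 0.0.
    return defaultdict(float)

def choose_action(q, state):
    # Epsilon-greedy selection over the discrete action set.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state):
    # Standard single-agent Q-learning backup, applied per robot:
    # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])

# Toy usage: one robot observes a transition and updates its table.
q = make_q_table()
s, a = "box_left_of_goal", "push_east"
q_update(q, s, a, reward=1.0, next_state="box_at_goal")
```

Because each robot learns its own table, no joint-action model is needed, which is the simplicity argument the abstract makes for the MDP-based extension over the SG-based one.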
