Abstract

To address the formation and path planning of multirobot systems in an unknown environment, a path planning method for multirobot formation based on improved Q-learning is proposed. Following the leader-following approach, the leader robot plans its path with an improved Q-learning algorithm, while the follower robots track it using a gravitational potential field (GPF) strategy that selects actions through a designed cost function. Specifically, to improve Q-learning, the Q-values are initialized with environmental guidance from the target's GPF. Then, a virtual obstacle-filling avoidance strategy is presented, which fills free cells judged likely to lead into concave obstacles with virtual obstacles. In addition, the simulated annealing (SA) algorithm, whose control temperature is adjusted in real time according to the learning progress of Q-learning, is applied to improve the action selection strategy. The experimental results show that the improved Q-learning algorithm reduces the convergence time by 89.9% and the number of convergence rounds by 63.4% compared with the traditional algorithm. With this method, multiple robots achieve a clear division of labor and quickly plan a globally optimized formation path in a completely unknown environment.
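For a concrete picture of the GPF-based initialization mentioned above, the sketch below biases a fresh Q-table toward the goal by seeding every state with a potential that grows less negative as the distance to the target shrinks. The linear-in-distance form and the gain k are illustrative assumptions; the paper's exact initialization formula may differ.

```python
import numpy as np

def init_q_with_gpf(grid_shape, goal, n_actions, k=0.5):
    """Initialize a Q-table from the goal's gravitational potential field.

    Each state's initial Q-values are seeded from a potential that rises
    (becomes less negative) as the robot nears the goal, so early
    exploration is biased toward the target instead of starting from an
    all-zero table. The -k * distance form is an assumption.
    """
    rows, cols = grid_shape
    Q0 = np.zeros((rows * cols, n_actions))
    for r in range(rows):
        for c in range(cols):
            d = np.hypot(r - goal[0], c - goal[1])  # Euclidean distance to goal
            Q0[r * cols + c, :] = -k * d            # higher value near the goal
    return Q0
```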

Highlights

  • As robots are used ever more widely across industries, a single robot is often insufficient for complex tasks

  • Classical Q-learning (Algorithm 1): for each episode, repeat until a terminal state is reached: (1) select an action a_t at state s_t according to the ε-greedy action selection strategy; (2) execute a_t, enter state s_{t+1}, and receive the immediate reward r_t from the environment; (3) update the value function as Q(s_t, a_t) = Q(s_t, a_t) + α[r_t + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t)]; (4) s_t ← s_{t+1}; then increment the episode counter (a runnable sketch follows after this list)

  • The steps of the tracking strategy based on the gravitational potential field (GPF) for the follower robot are as follows (see the cost-function sketch after this list): Step 1: if the follower robot receives the coordinates broadcast by the leader robot, it determines its target state according to the formation, i.e., the desired target position at that moment
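As a concrete illustration of Algorithm 1, here is a minimal, runnable Python sketch of classical tabular Q-learning. The environment interface (reset() returning a state index and step(action) returning (next_state, reward, done)) and the hyperparameter values are assumptions for illustration, not the paper's implementation.

```python
import random
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    """Classical tabular Q-learning with epsilon-greedy action selection."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()                      # start state index
        done = False
        while not done:
            # (1) epsilon-greedy: explore with probability epsilon
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                a = int(np.argmax(Q[s]))
            # (2) interact with the environment, observe reward
            s_next, r, done = env.step(a)
            # (3) Q(s,a) <- Q(s,a) + alpha*[r + gamma*max_a' Q(s',a') - Q(s,a)]
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
            # (4) advance to the next state
            s = s_next
    return Q
```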
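The excerpt shows only Step 1 of the follower's GPF tracking strategy. Below is a hedged sketch of how the cost-based action selection could look on a grid: the follower evaluates each candidate move against an attractive potential centered on its desired formation slot and picks the cheapest free cell. The quadratic potential, the 8-connected action set, and the obstacle handling are assumptions, not necessarily the paper's exact cost function.

```python
import numpy as np

# 8-connected candidate moves on a grid (illustrative action set)
ACTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
           (0, 1), (1, -1), (1, 0), (1, 1)]

def select_follower_action(pos, target, occupied, k_att=1.0):
    """Pick the move that minimizes an attractive-potential cost.

    U_att(p) = 0.5 * k_att * ||p - target||^2 pulls the follower toward
    its desired formation slot; cells in `occupied` (obstacles or other
    robots) are skipped. The quadratic form is a standard assumption.
    """
    best_action, best_cost = None, float("inf")
    for dx, dy in ACTIONS:
        p = (pos[0] + dx, pos[1] + dy)
        if p in occupied:                      # blocked cell, not a candidate
            continue
        d = np.hypot(p[0] - target[0], p[1] - target[1])
        cost = 0.5 * k_att * d ** 2            # attractive (gravitational) potential
        if cost < best_cost:
            best_action, best_cost = (dx, dy), cost
    return best_action
```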

Summary

Introduction

As robots are used ever more widely across industries, a single robot is often insufficient for complex tasks. Building on single-robot path planning, Sruthi et al. [11] designed a nonlinear tracking controller to achieve multirobot formation. By combining formation control with leader-following and priority methods, Sang et al. [12] used the MTAPF algorithm with an improved A∗ algorithm for path planning. The above methods all initialize the Q-value with prior information to improve the algorithm, without considering the avoidance of concave obstacles or the adjustment of the action selection strategy. The innovations of this paper are as follows: an improved Q-learning algorithm is presented for path planning, in which environmental guidance and a virtual obstacle-filling avoidance strategy are added to accelerate convergence, and the SA algorithm is applied to improve the action selection strategy; in addition, the follower robots achieve the GPF tracking strategy by selecting actions through a designed cost function
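The excerpt does not spell out how the SA algorithm adjusts the action selection. A common realization, shown below as an assumption-laden sketch, samples actions from a Boltzmann distribution over Q-values and cools the temperature as learning progresses, so exploration dominates early and exploitation late. The geometric cooling schedule T_k = T0 · decay^k stands in for the paper's real-time temperature adjustment.

```python
import numpy as np

def boltzmann_action(q_row, temperature, rng=None):
    """Sample an action with probability proportional to exp(Q / T).

    High T -> near-uniform exploration; low T -> near-greedy choice.
    """
    if rng is None:
        rng = np.random.default_rng()
    z = q_row / max(temperature, 1e-8)   # scale Q-values by temperature
    z = z - z.max()                      # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()  # softmax over actions
    return int(rng.choice(len(q_row), p=probs))

# Illustrative cooling schedule (assumption): T_k = T0 * decay**k
T0, decay = 10.0, 0.99
temperatures = [T0 * decay ** k for k in range(500)]
```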

Related Methods
Improved Q-Learning Proposed for Path Planning of Leader Robot
A Path Planning Method for Multirobot Formation
Experiments and Analysis
Findings
Conclusion
