As mobile robots are widely deployed in industry, path planning remains a difficult problem. Reinforcement learning algorithms such as Q-learning play a major role in path planning. The traditional Q-learning algorithm mainly uses an ε-greedy search policy. With a fixed search factor ε, however, it suffers from slow convergence, long running time, and many consecutive action changes (such as the number of turns during robot movement), which are not conducive to the stability requirements of mobile robots in industrial transportation. In the transportation of dangerous chemicals especially, repeated turning increases the risk of objects toppling. This paper proposes an improved dynamic search strategy based on ε-greedy to improve the stability of mobile robots in motion planning. Experiments show that, in the test environment, the dynamic search strategy converges faster, takes less time, produces fewer consecutive action changes, and achieves higher motion stability.
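
To make the terms above concrete, the following is a minimal Python sketch of ε-greedy action selection, a search factor that changes over episodes, and a turn counter for a planned path. The abstract does not specify the paper's dynamic strategy, so the exponential decay schedule here is only an assumed stand-in, and the names `epsilon_greedy`, `decayed_epsilon`, `count_turns` and all parameter values are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        # Exploration: pick any action index uniformly at random.
        return rng.randrange(len(q_values))
    # Exploitation: pick the index of the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decayed_epsilon(episode, eps_start=0.9, eps_end=0.05, decay=0.99):
    """Assumed schedule (not the paper's): exponential decay with a floor."""
    return max(eps_end, eps_start * decay ** episode)


def count_turns(actions):
    """Count how many times consecutive actions along a path differ."""
    return sum(1 for prev, cur in zip(actions, actions[1:]) if prev != cur)


if __name__ == "__main__":
    q_row = [0.1, 0.5, 0.2]  # hypothetical Q-values for one state
    for episode in (0, 100, 500):
        eps = decayed_epsilon(episode)
        print(episode, round(eps, 3), epsilon_greedy(q_row, eps))
    print(count_turns(["up", "up", "right", "right", "up"]))
```

A fixed search factor corresponds to calling `epsilon_greedy` with the same `epsilon` in every episode; a dynamic one varies it, here by exploring heavily early and mostly exploiting later. `count_turns` gives one simple way to measure the consecutive action changes that the abstract links to motion stability.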