Abstract

Mobile robots are now widely deployed, and their movement depends on effective navigation, especially path exploration. To address this navigation problem, this article proposes a method based on deep reinforcement learning and recurrent neural networks, which combines a double-network structure and recurrent neural network modules with the reinforcement learning framework. The article also designs corresponding parameter functions to improve the model's performance. To test the effectiveness of this method, training is carried out on a grid-map model in a two-dimensional simulation environment, a three-dimensional TurtleBot simulation environment, and a physical robot environment, and the resulting data are compared across settings. The experimental results show that the proposed algorithm improves both path-finding efficiency and path length.

Highlights

  • As human beings, when we want to go to an unfamiliar place, we first try to understand the environment between the starting point and the destination, and subconsciously plan the most effective route in our minds

  • This article introduces a novel navigation method based on deep reinforcement learning (DRL) for path planning of agents moving in different environments

  • To improve the feasibility of the navigation method, the obstacle and map rewards and the discount factor are redesigned according to the environment
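The highlights above mention a grid-map reward design in which obstacles, ordinary cells, and the goal are rewarded differently. A minimal sketch of such a reward function is shown below; the specific reward values are illustrative assumptions, not the values used in the paper, which redesigns them per environment.

```python
def grid_reward(cell, goal, is_obstacle):
    """Return the reward for an agent entering `cell` on a grid map.

    Assumed illustrative values: a collision penalty, a goal reward,
    and a small per-step cost that encourages shorter paths.
    """
    if is_obstacle:
        return -1.0   # penalty for hitting an obstacle
    if cell == goal:
        return 1.0    # reward for reaching the destination
    return -0.01      # small step cost to favor short paths
```

The small negative step cost is a common design choice: without it, any collision-free path would earn the same return, and the agent would have no incentive to find a short one.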


Summary

Introduction

As human beings, when we want to go to an unfamiliar place, we first try to understand the environment between the starting point and the destination, and subconsciously plan the most effective route. Deep reinforcement learning combines the perceptual ability of deep learning (DL) with the decision-making ability of reinforcement learning (RL) and directly controls the behavior of agents by learning from high-dimensional perceptual input, which provides a new approach to robot navigation problems. In reinforcement learning, the selected action affects not only the immediate reward but also the next state of the environment and, through it, the final return. The max operation in standard Q-learning and DQN uses the same values both to select and to evaluate an action; this can favor overestimated values and produce an overly optimistic estimate of the action value. In deep reinforcement learning, a DL model replaces the Q-table with a neural network, so that the corresponding decision can be obtained by feeding in the state. This article designs a corresponding reward generator based on the grid map to provide feedback on the agent's behavior. Training uses stochastic gradient descent (SGD) [20], a mini-batch training method.
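The overestimation issue described above is what the double-network ("double net") structure addresses: the online network selects the next action while the target network evaluates it, instead of one network doing both via a single max. A minimal sketch of the two target computations, using NumPy arrays of next-state Q-values (the variable names are illustrative):

```python
import numpy as np

def dqn_target(q_target_next, reward, gamma, done):
    # Standard DQN: the same network both selects and evaluates the
    # next action via max, which can overestimate the action value.
    return reward + (1.0 - done) * gamma * float(np.max(q_target_next))

def double_dqn_target(q_online_next, q_target_next, reward, gamma, done):
    # Double DQN: the online network selects the action (argmax), and
    # the target network evaluates it, reducing overestimation bias.
    a = int(np.argmax(q_online_next))
    return reward + (1.0 - done) * gamma * float(q_target_next[a])
```

For example, if the online network prefers action 1 but the target network scores action 0 higher, the double-DQN target uses the target network's value for action 1, which is lower than the plain max that standard DQN would take.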

