Abstract

Autonomously exploring and mapping is one of the open challenges of robotics and artificial intelligence. Especially when the environment is unknown, choosing the optimal navigation directive is not straightforward. In this paper, we propose a reinforcement learning framework for navigating, exploring, and mapping unknown environments. The reinforcement learning agent is in charge of selecting the commands for steering the mobile robot, while a SLAM algorithm estimates the robot pose and maps the environment. To select optimal actions, the agent is trained to be curious about the world. This concept translates into the introduction of a curiosity-driven reward function that encourages the agent to steer the mobile robot towards unknown and unseen areas of the world and of the map. We test our approach on exploration challenges in different indoor environments. The agent trained with the proposed reward function outperforms agents trained with reward functions commonly used in the literature for solving such tasks.

Highlights

  • The problem of autonomous robot navigation is traditionally tackled by employing environment representations, i.e. maps, that are used to plan a collision-free path to reach specific target locations

  • The reinforcement learning algorithm, i.e. DDPG, relies only on 80 2D-LiDAR readings, the robot’s pose estimate coming from the Rao-Blackwellized particle filter (RBPF) Simultaneous Localization and Mapping (SLAM) algorithm, the previous action taken by the agent, the percentage of the map to be explored, and the time steps left before the end of the episode (see the observation sketch after this list)

  • We propose an adaptation of the episodic curiosity reward introduced by Savinov et al. (2018) for improving the exploration skills of the reinforcement learning algorithm in the context of active SLAM, and we investigate its effect on the generalization skills of the trained agent to different environment topologies (a simplified reward sketch follows this list)
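
As a concrete illustration of the observation described in the second highlight, the sketch below assembles those inputs into a single vector. The function name, the maximum LiDAR range, and the normalization choices are illustrative assumptions, not the paper's exact preprocessing.

```python
import numpy as np

def build_observation(lidar_ranges, pose_estimate, prev_action,
                      explored_percentage, steps_left, max_steps):
    """Assemble the DDPG observation described in the highlights.

    All names, the 10 m maximum range, and the normalizations are
    illustrative assumptions, not the paper's exact preprocessing.
    """
    assert len(lidar_ranges) == 80                       # 80 2D-LiDAR readings
    lidar = np.asarray(lidar_ranges, np.float32) / 10.0  # assumed 10 m max range
    pose = np.asarray(pose_estimate, np.float32)         # (x, y, yaw) from RBPF-SLAM
    action = np.asarray(prev_action, np.float32)         # previous (v, omega) command
    remaining = np.array([explored_percentage / 100.0], np.float32)  # map share still to explore
    time_left = np.array([steps_left / max_steps], np.float32)       # normalized step budget
    return np.concatenate([lidar, pose, action, remaining, time_left])
```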
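The episodic curiosity bonus in the third highlight can be pictured with a simplified stand-in. Savinov et al. (2018) score novelty with a learned reachability network over observation embeddings stored in an episodic memory; the sketch below swaps that network for Euclidean distance between SLAM pose estimates, so the class name, threshold, and bonus value are assumptions for illustration only.

```python
import numpy as np

class EpisodicCuriosityReward:
    """Simplified episodic-curiosity bonus in the spirit of Savinov et al. (2018).

    The original method trains a reachability network over observation
    embeddings; as an illustrative stand-in, novelty is measured here by
    Euclidean distance between SLAM pose estimates kept in an episodic
    memory. The threshold and bonus values are assumptions.
    """

    def __init__(self, novelty_threshold=1.0, bonus=1.0):
        self.memory = []                             # poses visited this episode
        self.novelty_threshold = novelty_threshold   # metres
        self.bonus = bonus

    def reset(self):
        self.memory.clear()                          # call at episode start

    def __call__(self, pose):
        position = np.asarray(pose[:2], np.float32)  # (x, y) position only
        if not self.memory:
            self.memory.append(position)
            return self.bonus
        nearest = min(np.linalg.norm(position - m) for m in self.memory)
        if nearest > self.novelty_threshold:         # far from everything seen so far
            self.memory.append(position)             # remember the new place
            return self.bonus                        # reward reaching novel areas
        return 0.0                                   # familiar region, no bonus
```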


Summary

Introduction

The problem of autonomous robot navigation is traditionally tackled by employing environment representations, i.e. maps, that are used to plan a collision-free path to reach specific target locations. These indoor maps are usually constructed using Simultaneous Localization and Mapping, or SLAM, algorithms (Thrun et al, 2005). Successful reinforcement learning-based solutions are proposed in (Wu et al, 2018), (Tai et al, 2017), (Zhelo et al, 2018), (Pfeiffer et al, 2017), (Zhang et al, 2018), and (Zhang et al, 2020). Such map-less path planners, however, often require long training times and a large amount of data to perform well. The actor network π(s; θπ) is updated with the deterministic policy gradient theorem, shown in Equation (1):

∇_θπ J ≈ E_s[ ∇_a Q(s, a; θQ) |_{a=π(s; θπ)} · ∇_θπ π(s; θπ) ]        (1)
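
A minimal PyTorch sketch of one such actor step is given below, assuming `actor` and `critic` are standard torch.nn modules; it illustrates the textbook DDPG update of Equation (1), not necessarily the paper's exact training loop.

```python
import torch

def ddpg_actor_update(actor, critic, actor_optimizer, states):
    """One actor step via the deterministic policy gradient of Equation (1).

    `actor` maps states to actions and `critic` maps (state, action) pairs
    to Q-values; both are assumed to be torch.nn.Module instances. This is
    the standard DDPG update, not the paper's exact implementation.
    """
    actions = actor(states)                       # a = π(s; θπ)
    actor_loss = -critic(states, actions).mean()  # ascend Q by descending -Q
    actor_optimizer.zero_grad()
    actor_loss.backward()   # autograd forms ∇_a Q · ∇_θπ π via the chain rule
    actor_optimizer.step()
    return actor_loss.item()
```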

