Abstract

In this paper, we focus on the challenges of training efficiency, the design of reward functions, and generalization in reinforcement learning for visual navigation, and we propose a regularized extreme learning machine-based inverse reinforcement learning approach (RELM-IRL) to improve navigation performance. Our contributions are three-fold. First, we present a framework that combines an extreme learning machine with inverse reinforcement learning; it improves sample efficiency, obtains the reward function directly from the images observed by the agent, and improves generalization to new targets and new environments. Second, the extreme learning machine is regularized by multi-response sparse regression and the leave-one-out method, which further improves generalization. Third, simulation experiments in the AI-THOR environment show that the proposed approach outperforms previous end-to-end approaches, demonstrating its effectiveness and efficiency.
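The abstract describes obtaining the reward directly from observed images with an extreme learning machine regularized by multi-response sparse regression and the leave-one-out method. As a rough, hypothetical illustration only (not the paper's implementation; the MRSR pruning step is replaced here by plain ridge regularization), the sketch below fits an ELM reward model and picks the regularization strength with the closed-form leave-one-out (PRESS) error. Names such as `elm_reward_model` and `n_hidden` are assumptions.

```python
import numpy as np

def elm_reward_model(X, y, n_hidden=64, lambdas=(1e-3, 1e-2, 1e-1, 1.0), seed=0):
    """Fit a ridge-regularized ELM mapping state features X (n x d) to rewards y (n,).

    Simplified stand-in for the paper's RELM: the hidden layer is random and fixed,
    only the output weights are solved for, and the regularization strength is
    selected by the closed-form leave-one-out (PRESS) error of ridge regression.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))      # random input weights (never trained)
    b = rng.normal(size=n_hidden)                    # random hidden biases
    H = np.tanh(X @ W + b)                           # hidden-layer activations

    best = None
    for lam in lambdas:
        A_inv = np.linalg.inv(H.T @ H + lam * np.eye(n_hidden))
        beta = A_inv @ H.T @ y                                    # ridge output weights
        hat_diag = np.einsum("ij,jk,ik->i", H, A_inv, H)          # diagonal of the hat matrix
        loo = np.mean(((y - H @ beta) / (1.0 - hat_diag)) ** 2)   # PRESS / LOO error
        if best is None or loo < best[0]:
            best = (loo, lam, beta)

    _, lam, beta = best
    predict = lambda X_new: np.tanh(X_new @ W + b) @ beta
    return predict, lam
```

In an IRL setting, X would hold features of the states visited in expert trajectories and y the current reward estimates for those states; the LOO selection is what stands in for the generalization benefit the abstract attributes to the regularized ELM.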

Highlights

  • With the rapid development of artificial intelligence technologies, the research and application of unmanned platforms, such as robots, have become a research hotspot

  • This paper proposes an inverse reinforcement learning (IRL) navigation control method based on a regularized extreme learning machine (RELM-IRL)

  • When the expected cumulative reward obtained by every candidate policy is no greater than the expected cumulative reward obtained by the expert policy, the reward function learned from the expert data is taken as the reward function for reinforcement learning (RL), as formalized below
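The last highlight states the standard apprenticeship-learning condition in words. The equations below are a hedged sketch of the usual formalization, under my assumption (following the apprenticeship-learning literature rather than this paper's text) that the reward is linear in basis features, r(s) = wᵀφ(s), so the expected cumulative reward under a policy π is wᵀμ(π).

```latex
% mu(pi): discounted feature expectations of policy pi; pi_E: the expert policy.
\mu(\pi) = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\,\phi(s_t) \,\middle|\, \pi\right],
\qquad
\text{find } w \ \text{such that} \quad w^{\top}\mu(\pi_E) \;\ge\; w^{\top}\mu(\pi) \quad \forall\,\pi .
```

When this inequality holds for every candidate policy, no policy attains a higher expected cumulative reward than the expert, and wᵀφ is used as the reward function for RL.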


Summary

Introduction

With the rapid development of artificial intelligence technologies, the research and application of unmanned platforms, such as robots, have become a research hotspot. Navigation methods based on deep learning (DL) require the robot to localize itself accurately and rely on a large amount of human prior knowledge, which is not in line with the way humans navigate. Inverse reinforcement learning instead learns the reward function from expert data. Apprenticeship learning [30,31] is a type of IRL that represents the reward function with prior basis functions and, given expert data, ensures that the optimal policy obtained from the learned reward function is close to the expert policy; a simplified sketch of this loop follows.
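Since the summary only names apprenticeship learning, the following is a minimal, hypothetical sketch of its outer loop rather than the exact algorithm in [30,31]: estimate the expert's discounted feature expectations, then repeatedly update the reward weights toward the remaining gap and retrain a policy under the resulting reward. The helper names and the simplified weight update are my assumptions.

```python
import numpy as np

def feature_expectations(trajectories, phi, gamma=0.99):
    """Empirical discounted feature expectations mu = E[ sum_t gamma^t * phi(s_t) ].

    `trajectories` is a list of state sequences; `phi` maps a state to a feature vector.
    """
    mus = [sum(gamma ** t * phi(s) for t, s in enumerate(traj)) for traj in trajectories]
    return np.mean(mus, axis=0)

def reward_weight_update(mu_expert, mu_candidates):
    """Simplified apprenticeship-style update (not the exact cited method):
    point the reward weights w along the smallest remaining gap between the expert's
    feature expectations and those of the candidate policies; the caller then trains
    a new policy under r(s) = w . phi(s) and stops once the gap norm is small.
    """
    gaps = [mu_expert - mu for mu in mu_candidates]
    w = min(gaps, key=np.linalg.norm)
    margin = float(np.linalg.norm(w))
    return w / (margin + 1e-12), margin
```

Each iteration adds the newly trained policy's feature expectations to `mu_candidates`, and the remaining gap to the expert is monitored as the stopping criterion.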
