Abstract

As vision and language processing techniques have made great progress, mapless visual navigation has become a central topic in domestic robotics. However, most current end-to-end navigation models are trained and tested on identical datasets with a fixed structure, which leads to severe performance degradation on unseen targets and environments. Since targets of the same category can have quite diverse appearances, the generalization ability of these models is further limited by their image-based task descriptions. In this article we propose a model-agnostic meta-learning (MAML) based, text-driven visual navigation model that generalizes to untrained tasks. Built on a meta-reinforcement-learning approach, the agent accumulates navigation experience from existing targets and environments; when asked to find a new object or explore a new scene, it quickly learns the unfamiliar task through relatively few adaptation trials. To improve learning efficiency and accuracy, we introduce fully convolutional instance-aware semantic segmentation (FCIS) and Word2vec into our deep-reinforcement-learning network to extract visual and semantic features by object class, creating a more direct and concise link between targets and their surroundings. Experiments on the realistic Matterport3D dataset evaluate target-driven navigation performance and generalization ability. The results demonstrate that our adaptive navigation model can navigate to text-defined targets and adapt quickly to untrained tasks, outperforming other state-of-the-art navigation approaches.
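
For illustration, the following is a minimal sketch of the MAML-style meta-update described above, assuming a PyTorch policy network; `nav_loss`, `alpha`, and `beta` are illustrative names and hyperparameters, not the paper's exact implementation.

```python
import torch

def maml_meta_step(policy, tasks, nav_loss, alpha=0.01, beta=0.001):
    """One MAML-style meta-update over a batch of navigation tasks.

    policy   -- torch.nn.Module mapping (observation, target embedding)
                to action logits
    tasks    -- iterable of (support, query) trial batches per task
    nav_loss -- nav_loss(params, batch): RL objective evaluated with the
                given parameter dict (e.g. via torch.func.functional_call)
    """
    meta_opt = torch.optim.Adam(policy.parameters(), lr=beta)
    meta_opt.zero_grad()
    params = dict(policy.named_parameters())
    for support, query in tasks:
        # Inner loop: one gradient step adapts the policy to this task.
        grads = torch.autograd.grad(nav_loss(params, support),
                                    list(params.values()),
                                    create_graph=True)
        adapted = {name: p - alpha * g
                   for (name, p), g in zip(params.items(), grads)}
        # Outer loop: the adapted policy's loss on held-out trials
        # backpropagates through the inner step to the meta-parameters.
        nav_loss(adapted, query).backward()
    meta_opt.step()
```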

Highlights

  • Substantial research has been carried out in the field of mapless robot navigation

  • The Matterport3D environment consists of 10800 panoramic views built from 194400 RGB images across 90 scenes, with 7189 paths sampled from its navigation graphs

  • A navigation episode is considered complete when the target instance described by the text input is within the field of view and the agent has arrived at its nearest viewpoint, or when the agent has taken 10000 actions without finding the target (a minimal check is sketched below)
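
A minimal sketch of this termination rule, with hypothetical `target_visible` and `nearest_viewpoint` environment helpers that are not part of any published Matterport3D API:

```python
MAX_STEPS = 10_000  # step budget before an episode counts as failed

def episode_done(env, target, agent_viewpoint, steps_taken):
    """Return (done, success) for one navigation episode."""
    success = (env.target_visible(target, agent_viewpoint)        # target in view
               and agent_viewpoint == env.nearest_viewpoint(target))  # closest node
    return success or steps_taken >= MAX_STEPS, success
```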


Summary

Introduction

Substantial research has been carried out in the field of mapless robot navigation. Agents governed by goal-based tasks are designed to navigate using only visual information and little prior knowledge of the environment, which reduces system cost and power consumption. In addition to image processing, mapless visual navigation requires the agent to interact with the environment efficiently, for which deep reinforcement learning has been adopted. DQN [1] and A3C [2], considered the most representative RL algorithms, are widely used in the navigation field to realize this interactive process. Based on such an end-to-end learning mechanism, the navigation model avoids the errors accumulated across the stages of a traditional engineering pipeline, such as visual feature extraction, map building, object localization, and path planning, so the performance of the whole system can be greatly improved and maintained.
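
As a concrete illustration of such an end-to-end mapping, here is a minimal actor-critic network in the spirit of A3C that takes a visual feature vector and a text embedding and produces action logits plus a state value; the fusion scheme and dimensions (e.g., a 300-dimensional Word2vec-style embedding) are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class NavPolicy(nn.Module):
    """Minimal A3C-style actor-critic head for text-driven navigation."""

    def __init__(self, visual_dim=2048, text_dim=300, hidden=512, n_actions=4):
        super().__init__()
        # Fuse the visual features with the target's text embedding.
        self.fuse = nn.Sequential(
            nn.Linear(visual_dim + text_dim, hidden),
            nn.ReLU(),
        )
        self.actor = nn.Linear(hidden, n_actions)  # action logits
        self.critic = nn.Linear(hidden, 1)         # state-value estimate

    def forward(self, visual_feat, text_emb):
        h = self.fuse(torch.cat([visual_feat, text_emb], dim=-1))
        return self.actor(h), self.critic(h)
```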


