Abstract

Visual navigation (vNavigation) is a key and fundamental technology that enables artificial agents to interact with their environment and achieve advanced behaviors. Visual navigation with deep reinforcement learning (DRL) is a new research hotspot in artificial intelligence and robotics that incorporates the decision making of DRL into visual navigation. As an end-to-end method, visual navigation via DRL directly receives high-dimensional images and generates an optimal navigation policy. In this paper, we first present an overview of reinforcement learning (RL), deep learning (DL), and deep reinforcement learning (DRL). Then, we systematically describe five main categories of visual DRL navigation: direct DRL vNavigation, hierarchical DRL vNavigation, multi-task DRL vNavigation, memory-inference DRL vNavigation, and vision-language DRL vNavigation, and review these algorithms in detail. Finally, we discuss the challenges and possible opportunities for visual DRL navigation for artificial agents.

Highlights

  • Artificial agents refer to software or hardware entities that can perform actions in an environment independently, and include virtual robots (such as characters in games and entities in virtual environments) and real robots (such as service robots, industrial robots, and unmanned vehicles)

  • Although laser simultaneous localization and mapping (SLAM) has achieved some success in recent years, the high price of laser sensors hinders the practical application of laser SLAM, and the efficiency of laser SLAM is susceptible to poor weather conditions

The associate editor coordinating the review of this manuscript and approving it for publication was Junchi Yan.

Summary

INTRODUCTION

Artificial agents refer to software or hardware entities that can perform actions in an environment independently, and include virtual robots (such as characters in games and entities in virtual environments) and real robots (such as service robots, industrial robots, and unmanned vehicles). Both PTAM and ORB-SLAM are based on feature extraction, but feature-based methods cannot handle texture-poor images well. To address this issue, Engel et al. [8] proposed LSD-SLAM, a direct (feature-less) visual SLAM algorithm that enables the construction of large-scale, consistent maps of the environment. One prominent issue with traditional navigation pipelines is their susceptibility to accumulated sensor noise, which propagates down the pipeline from mapping and localization to path planning, making these algorithms less robust. They also require extensive case-specific, scenario-driven manual engineering, which makes traditional navigation difficult to integrate with other downstream artificial intelligence tasks that have achieved superior performance with learning methods, such as visual recognition, question answering, and other advanced intelligent tasks [10].
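To make the contrast with these hand-engineered pipelines concrete, the end-to-end formulation described in the abstract (raw pixels in, action distribution out) can be sketched as follows. This is a minimal illustration only: the observation size (84x84), the four discrete actions, and the single linear layer are all illustrative assumptions; real visual DRL navigation systems use deep convolutional networks trained with algorithms such as DQN or A3C.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: an 84x84 grayscale observation, 4 discrete actions
OBS_DIM = 84 * 84
N_ACTIONS = 4  # e.g. move forward, move backward, turn left, turn right

# A single linear layer stands in for the deep network used in practice
W = rng.normal(0.0, 0.01, size=(N_ACTIONS, OBS_DIM))
b = np.zeros(N_ACTIONS)

def policy(image: np.ndarray) -> np.ndarray:
    """Map a raw image observation directly to action probabilities."""
    x = image.reshape(-1) / 255.0          # flatten and normalize pixels
    logits = W @ x + b
    e = np.exp(logits - logits.max())      # numerically stable softmax
    return e / e.sum()

# One step of the end-to-end loop: observe an image, then sample an action
obs = rng.integers(0, 256, size=(84, 84)).astype(np.float64)
probs = policy(obs)
action = int(rng.choice(N_ACTIONS, p=probs))
```

The point of the sketch is the absence of any intermediate mapping or localization stage: perception and decision making are fused into a single learned function, which is what lets DRL navigation avoid the noise-accumulation problem of the modular pipeline.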

DEEP REINFORCEMENT LEARNING
DIRECT DRL vNavigation
CURRENT CHALLENGES AND OPPORTUNITIES
Findings
CONCLUSION