Abstract

Vanilla policy gradient methods suffer from high variance, which leads to unstable policies during training: the policy's performance fluctuates drastically between iterations. To address this issue, we analyze the policy optimization process of a navigation method based on deep reinforcement learning (DRL) that uses asynchronous gradient descent for optimization. We present a variant navigation method (asynchronous proximal policy optimization navigation, appoNav) that guarantees monotonic policy improvement during policy optimization. Our experiments are conducted in DeepMind Lab, and the results show that artificial agents trained with appoNav perform better than those trained with the compared algorithm.
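The objective itself is not reproduced in this summary; as a rough illustration, a minimal PyTorch sketch of the standard PPO clipped surrogate loss, the kind of constrained update that appoNav builds on, might look as follows. The function name, the clipping coefficient value, and the tensor layout are our assumptions, not details taken from the paper.

    import torch

    def clipped_surrogate_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
        # Probability ratio between the updated policy and the old policy.
        ratio = torch.exp(new_log_probs - old_log_probs)
        unclipped = ratio * advantages
        # Clipping bounds how far a single update can move the policy,
        # which underlies the (approximate) monotonic improvement property.
        clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
        # Negated because optimizers minimize; PPO maximizes the surrogate.
        return -torch.min(unclipped, clipped).mean()

By contrast, an unconstrained policy gradient update places no bound on the per-step policy change, which is the source of the instability described above.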

Highlights

  • Navigation in an unstructured environment is one of the most important abilities for mobile robotics and artificial agents [1,2,3]

  • Traditional methods mainly divide navigation into several parts [4]: simultaneous localization and mapping (SLAM) [5,6,7], path planning [8], and semantic segmentation [9, 10]. These methods are not end-to-end algorithms: each part is a challenging research subject in its own right, and fusing the parts often leads to large computational errors

  • To reduce the fusion error, we focus on end-to-end navigation based on deep reinforcement learning, where navigational abilities can emerge as a byproduct of an artificial agent learning a policy through reward maximization


Summary

Introduction

Navigation in an unstructured environment is one of the most important abilities for mobile robotics and artificial agents [1,2,3]. To reduce the fusion error introduced by traditional multi-stage pipelines, we focus on end-to-end navigation based on deep reinforcement learning, where navigational abilities can emerge as a byproduct of an artificial agent learning a policy through reward maximization. DeepMind Lab can be used to study how autonomous artificial agents learn complex tasks in large, partially observed, and visually diverse worlds. Mirowski et al. [21] proposed a DRL navigation method based on A3C [18], augmented with auxiliary learning targets, to train artificial agents to navigate in DeepMind Lab. For ease of expression, we refer to DRL navigation using A3C as a3cNav. In this paper, we analyze the policy optimization issues of navigation based on the vanilla policy gradient; this type of navigation cannot control the change of expected advantage when an artificial agent learns to navigate in a maze. Experimental results show that an artificial agent trained with appoNav learns a better navigation policy in DeepMind Lab and exhibits a lower standard deviation than a3cNav.
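To make the contrast concrete, a vanilla (REINFORCE/A3C-style) policy gradient loss can be sketched as below. Nothing in it limits how far a single update moves the policy, so a large advantage estimate can change the behaviour drastically between iterations. This sketch is our illustration of the standard objective, not code from the paper.

    import torch

    def vanilla_pg_loss(log_probs, advantages):
        # Standard policy gradient objective (negated for gradient descent).
        # The advantage is treated as a constant with respect to the policy,
        # and nothing here controls the change of the expected advantage
        # from one update to the next.
        return -(log_probs * advantages.detach()).mean()

The clipped surrogate sketched after the abstract addresses exactly this: it caps the probability ratio so that each update stays close to the previous policy.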

Related Work
Background
Approach
Experiments
Conclusion