Abstract

In this paper, we propose a novel deep neural network-based agent model, VCENet, for performing target-driven visual navigation tasks. Most recent agent models for visual navigation try to recognize the real-time task context using only the objects and relationships detected in RGB input images. However, such an object-oriented visual context does not contain detailed information about the background scene, which is important for ground navigation. Moreover, it may result in a misrecognized context due to errors in the object and relation detectors. To overcome these problems, the proposed VCENet model represents the real-time task context using the appearance and geometric features of the background scene extracted from RGB-D input images, as well as the object relation features derived through graph embedding. In addition, many existing models, which learn an action policy based on reinforcement learning, provide neither reward functions for avoiding deadlocks during navigation nor an effective mechanism for learning a deadlock recovery policy to escape them. To address these problems, the proposed VCENet model provides not only reward functions for deadlock avoidance and recovery, but also a deadlock recovery policy based on imitation learning. Through a variety of experiments in photo-realistic virtual indoor environments provided by the 3D simulator AI2THOR, we demonstrate the superiority of the proposed VCENet model.
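
To make the context-representation idea concrete, the following is a minimal sketch, assuming a PyTorch implementation, of how background-scene features extracted from an RGB-D frame might be fused with graph-embedded object-relation features into a single task-context vector. The module names, layer shapes, feature dimensions, and fusion-by-concatenation design are illustrative assumptions, not details drawn from the VCENet paper.

    # Illustrative sketch, not the authors' implementation: fuse
    # RGB-D background-scene features with a precomputed graph
    # embedding of detected objects and their relations.
    import torch
    import torch.nn as nn

    class ContextEncoder(nn.Module):
        def __init__(self, scene_dim=512, relation_dim=128, context_dim=256):
            super().__init__()
            # Hypothetical CNN encoding the appearance and geometry of the
            # background scene from a 4-channel (RGB + depth) image.
            self.scene_cnn = nn.Sequential(
                nn.Conv2d(4, 32, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, scene_dim), nn.ReLU(),
            )
            # Project concatenated scene and relation features into a
            # fixed-size real-time task-context vector.
            self.fuse = nn.Linear(scene_dim + relation_dim, context_dim)

        def forward(self, rgbd, relation_embedding):
            scene_feat = self.scene_cnn(rgbd)                   # (B, scene_dim)
            fused = torch.cat([scene_feat, relation_embedding], dim=-1)
            return torch.relu(self.fuse(fused))                 # (B, context_dim)

    # Example usage with dummy tensors:
    encoder = ContextEncoder()
    rgbd = torch.randn(1, 4, 224, 224)   # one RGB-D frame
    rel = torch.randn(1, 128)            # graph-embedded object relations
    context = encoder(rgbd, rel)         # real-time task-context vector

Concatenation followed by a linear projection is only one plausible fusion choice here; an attention-based fusion would fit the same interface.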
