Abstract

In this paper, we propose a novel deep neural network-based agent model, VCENet, for performing target-driven visual navigation tasks. Most recent agent models for visual navigation try to recognize the real-time task context using only the objects and relationships detected in RGB input images. However, such an object-oriented visual context does not contain detailed information about the background scene, which is important for ground navigation. Moreover, it may result in a misrecognized context due to errors in the object and relation detectors. To overcome these problems, the proposed VCENet model represents the real-time task context using the appearance and geometric features of the background scene extracted from RGB-D input images, as well as the object relation features derived through graph embedding. In addition, many existing models, which learn an action policy based on reinforcement learning, provide neither reward functions for avoiding deadlocks during navigation nor an effective mechanism for learning a deadlock recovery policy to escape them. To address these problems, the proposed VCENet model provides not only reward functions for deadlock avoidance and recovery, but also a deadlock recovery policy based on imitation learning. Through a variety of experiments in photo-realistic virtual indoor environments provided by the 3D simulator AI2THOR, we demonstrate the superiority of the proposed VCENet model.
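
To make the context-representation idea concrete, the following is a minimal sketch, assuming a PyTorch implementation, of how background-scene features extracted from an RGB-D frame might be fused with graph-embedded object-relation features into a single task-context vector. The module names, layer shapes, feature dimensions, and fusion-by-concatenation design are illustrative assumptions, not details drawn from the VCENet paper.

    # Illustrative sketch, not the authors' implementation: fuse
    # RGB-D background-scene features with a precomputed graph
    # embedding of detected objects and their relations.
    import torch
    import torch.nn as nn

    class ContextEncoder(nn.Module):
        def __init__(self, scene_dim=512, relation_dim=128, context_dim=256):
            super().__init__()
            # Hypothetical CNN encoding the appearance and geometry of the
            # background scene from a 4-channel (RGB + depth) image.
            self.scene_cnn = nn.Sequential(
                nn.Conv2d(4, 32, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, scene_dim), nn.ReLU(),
            )
            # Project concatenated scene and relation features into a
            # fixed-size real-time task-context vector.
            self.fuse = nn.Linear(scene_dim + relation_dim, context_dim)

        def forward(self, rgbd, relation_embedding):
            scene_feat = self.scene_cnn(rgbd)                   # (B, scene_dim)
            fused = torch.cat([scene_feat, relation_embedding], dim=-1)
            return torch.relu(self.fuse(fused))                 # (B, context_dim)

    # Example usage with dummy tensors:
    encoder = ContextEncoder()
    rgbd = torch.randn(1, 4, 224, 224)   # one RGB-D frame
    rel = torch.randn(1, 128)            # graph-embedded object relations
    context = encoder(rgbd, rel)         # real-time task-context vector

Concatenation followed by a linear projection is only one plausible fusion choice here; an attention-based fusion would fit the same interface.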
