Abstract

Active object detection (AOD) offers a significant advantage in expanding the perceptual capacity of a robotic system. AOD is formulated as a sequential action decision process that determines optimal viewpoints for identifying objects of interest in a visual scene. While reinforcement learning (RL) has been used successfully to solve many AOD problems, conventional RL methods suffer from (i) sample inefficiency and (ii) unstable outcomes caused by the interdependence of action type (direction of view change) and action range (step size of view change). To address these issues, we propose a novel self-supervised RL method, which employs self-supervised representations of viewpoints to initialize the policy network and a self-supervised loss on action range to improve the optimization of the network parameters. The output and target pairs for the self-supervised loss are generated automatically: outputs come from the policy network's online predictions, and targets from a range shrinkage algorithm (RSA). The proposed method is evaluated and benchmarked on two public datasets (T-LESS and AVD) using on-policy and off-policy RL algorithms. The results show that our method improves detection accuracy and converges faster on both datasets. When evaluated in a more complex environment with a larger state space (where viewpoints are more densely sampled), our method performs more robustly and stably. An experiment in a real robot application scenario, disambiguating similar objects in a cluttered scene, also demonstrates the effectiveness of the proposed method.
