Abstract

Recently, 6D pose estimation methods have shown robust performance on highly cluttered scenes and under different illumination conditions. However, occlusion remains challenging, with recognition rates dropping below 10% for half-visible objects in some datasets. In this paper, we propose to use top-down visual attention and color cues to boost the performance of a state-of-the-art method in occluded scenarios. More specifically, color information is employed to detect potential points in the scene, improve feature matching, and compute more precise fitting scores. The proposed method is evaluated on the Linemod occluded (LM-O), TUD light (TUD-L), Tejani (IC-MI) and Doumanoglou (IC-BIN) datasets, as part of the SiSo BOP benchmark, which includes highly occluded cases, changing illumination, and multiple-instance scenarios. The method is analyzed and discussed for different parameters, color spaces and metrics. The presented results show the validity of the proposed approach and its robustness against illumination changes and multiple-instance scenarios, especially boosting performance on relatively highly occluded cases. The proposed solution provides an absolute improvement of up to 30% for occlusion levels between 40% and 50%, outperforming other approaches with a best overall recall of 71% on LM-O, 92% on TUD-L, 99.3% on IC-MI and 97.5% on IC-BIN.
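
The abstract describes three roles for color (selecting candidate scene points, gating feature matches, and weighting the fitting score) but does not give code. The following Python sketch is only a rough illustration under our own assumptions, not the authors' implementation: candidate points are kept when their hue is compatible with the model, and a pose hypothesis is re-scored by counting color-consistent inliers. All function names, the hue-based color metric, and the tolerances (30° hue, 5 mm distance) are hypothetical choices.

    import numpy as np

    def hue(rgb):
        """Hue angle in degrees for an RGB color with components in [0, 1]."""
        r, g, b = rgb
        mx, mn = max(rgb), min(rgb)
        if mx == mn:
            return 0.0
        if mx == r:
            return (60.0 * (g - b) / (mx - mn)) % 360.0
        if mx == g:
            return 60.0 * (b - r) / (mx - mn) + 120.0
        return 60.0 * (r - g) / (mx - mn) + 240.0

    def hue_match(c1, c2, tol_deg=30.0):
        """True if two colors lie within a hue tolerance (illustrative threshold)."""
        d = abs(hue(c1) - hue(c2))
        return min(d, 360.0 - d) <= tol_deg  # wrap-around hue distance

    def select_candidate_points(scene_cols, model_cols, tol_deg=30.0):
        """Indices of scene points whose color is compatible with some model color."""
        return [i for i, c in enumerate(scene_cols)
                if any(hue_match(c, mc, tol_deg) for mc in model_cols)]

    def color_weighted_fitting_score(model_pts, model_cols, scene_pts, scene_cols,
                                     R, t, dist_thresh=0.005, tol_deg=30.0):
        """Fraction of model points that, after applying the pose (R, t),
        land near a scene point of compatible color."""
        transformed = model_pts @ R.T + t
        hits = 0
        for p, c in zip(transformed, model_cols):
            d = np.linalg.norm(scene_pts - p, axis=1)
            j = int(np.argmin(d))
            if d[j] < dist_thresh and hue_match(c, scene_cols[j], tol_deg):
                hits += 1
        return hits / len(model_pts)

Hue rather than raw RGB is used in this sketch because it is comparatively stable under the illumination changes the benchmark emphasizes; in practice, any of the color spaces analyzed in the paper could replace this simple metric.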

Highlights

  • The precise determination of an object’s location in a given scene is a key capability for achieving flexible interaction and manipulation in autonomous operations [1,2]. Traditionally, most computer vision research efforts have been centered on the detection and classification of objects in monocular images; only a few studies have focused on explicitly solving the full six degree-of-freedom problem

  • A novel solution based on visual attention and color cues for improving robustness against occlusion in 6D pose estimation with the Point Pair Features voting approach has been presented

  • The method has been analyzed for different parameters, color spaces and metrics, showing better performance for all tested color spaces on the single instance of a single object (SiSo) task of the widely used Linemod occluded (LM-O) dataset


Summary

Introduction

Most computer vision research efforts have been centered on the detection and classification of objects in monocular images; only a few studies have focused on explicitly solving the full six degree-of-freedom problem. Along this line, most approaches address the problem from a 2D point of view, rather than inferring the precise rotation and position of objects in 3D space, commonly known as the 6D pose estimation problem. With the rise of machine learning, novel monocular methods appeared, showing increasing levels of robustness [3]. Most of these approaches were still limited to understanding the scene in terms of object classification, segmentation, and bounding box detection. Methods based on deep learning [4,5,6,7] have shown promising results in solving the problem from a 6D pose estimation perspective.

