Abstract

In RGB-D saliency detection, depth information plays a critical role in distinguishing salient objects or foregrounds from cluttered backgrounds. As the complement to color information, depth quality directly dictates the subsequent saliency detection performance. However, due to sensing artifacts and the limitations of depth acquisition devices, the quality of the obtained depth varies tremendously across scenarios. Consequently, conventional selective fusion-based RGB-D saliency detection methods may suffer degraded performance when salient objects exhibit low color contrast coupled with low depth quality. To address this problem, we make an initial attempt to estimate additional high-quality depth information, denoted Depth+. Serving as a complement to the original depth, Depth+ is fed into our newly designed selective fusion network to boost detection performance. To this end, we first retrieve a small group of images similar to the given input and build inter-image, nonlocal correspondences accordingly. Using these correspondences, the overall depth is coarsely estimated via our newly designed depth-transferring strategy. Next, we build fine-grained, object-level correspondences coupled with a saliency prior to further improve the quality of this coarse estimate. Compared with the original depth, the newly estimated Depth+ is potentially more informative for improving detection. Finally, we feed both the original depth and the estimated Depth+ into our selective deep fusion network, whose key novelty is to achieve an optimal complementary balance between the two depth sources, yielding better decisions and improved saliency boundaries.
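
The following is a minimal sketch of the coarse depth-transfer idea outlined above, not the authors' implementation: the retrieval step, the hand-crafted patch descriptors, and the Gaussian match weighting are simplified assumptions introduced only to illustrate the flow (retrieve similar RGB-D exemplars, build nonlocal correspondences, transfer their depth to the query). The saliency-prior refinement and the learned selective fusion network are omitted.

```python
# Illustrative sketch only; brute-force matching, intended for tiny images.
import numpy as np

def patch_features(img, patch=5):
    """Stack local patches as per-pixel descriptors (an assumed stand-in
    for whatever feature representation the method actually uses)."""
    h, w, c = img.shape
    pad = patch // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    feats = np.empty((h, w, patch * patch * c), dtype=np.float32)
    for y in range(h):
        for x in range(w):
            feats[y, x] = padded[y:y + patch, x:x + patch].ravel()
    return feats.reshape(h * w, -1)

def transfer_depth(query_rgb, exemplars, sigma=10.0):
    """Coarsely estimate depth for query_rgb by nonlocal matching against a
    small set of retrieved (rgb, depth) exemplars of the same resolution."""
    h, w, _ = query_rgb.shape
    q = patch_features(query_rgb)                        # (HW, F)
    est = np.zeros(h * w, dtype=np.float32)
    weight_sum = np.zeros(h * w, dtype=np.float32)
    for rgb, depth in exemplars:
        e = patch_features(rgb)                          # (HW, F)
        d = depth.astype(np.float32).ravel()             # (HW,)
        # Squared distances between every query pixel and every exemplar pixel.
        d2 = (q * q).sum(1)[:, None] + (e * e).sum(1)[None, :] - 2.0 * q @ e.T
        d2 = np.maximum(d2, 0.0)
        best = d2.argmin(axis=1)                         # best nonlocal match
        w_match = np.exp(-np.sqrt(d2[np.arange(len(q)), best]) / sigma)
        est += w_match * d[best]                         # transfer matched depth
        weight_sum += w_match
    return (est / np.maximum(weight_sum, 1e-8)).reshape(h, w)
```

In this sketch, the coarse Depth+ estimate is a match-weighted average of depth values transferred from the retrieved exemplars; in the paper, this estimate is further refined with object-level correspondences and a saliency prior before being fused with the original depth.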
