Abstract

Removing occlusions from an image can improve the robustness of many computer vision tasks, e.g., detection and tracking in surveillance. However, the contents behind an occlusion are invisible from a single view, which fundamentally limits single-view occlusion removal. The emerging light field (LF) data, which captures rich multi-view perception of a scene, offers a potential solution to this challenge. To better exploit LF data for locating occlusions and recovering the occluded contents, in this paper we propose an LF occlusion removal network (LFORNet), which consists of three key sub-networks: a foreground occlusion location (FOL) sub-network, a background content recovery (BCR) sub-network, and a refinement sub-network. Specifically, both the FOL and BCR sub-networks exploit the multi-view information in LF data and therefore share the same network structure, estimating the occlusion mask and a coarse map of the occluded contents, respectively. The refinement sub-network aggregates these two outputs to produce the refined occlusion-removal result. In addition, we feed the network with multi-angle view stacks, which make full use of the information inherent among the LF views. Experimental results show that our method handles occlusions of different sizes and surpasses state-of-the-art approaches on both synthetic and real-world scenes.
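The three-stage pipeline described above can be sketched structurally as follows. This is a hypothetical illustration with placeholder operations (view variance for occlusion evidence, per-pixel median for background recovery, mask-weighted blending for refinement), not the paper's learned sub-networks; all function names and shapes are assumptions for the sketch.

```python
import numpy as np

# Hypothetical sketch of the LFORNet pipeline: FOL and BCR both consume the
# multi-angle view stack; the refinement stage aggregates their outputs.
# Placeholder heuristics stand in for the learned sub-networks.

def fol_subnetwork(view_stack):
    """Foreground occlusion location: predict an occlusion mask in [0, 1]."""
    # Placeholder: occluders shift between views while the background stays
    # consistent, so per-pixel variance across views is occlusion evidence.
    variance = view_stack.var(axis=0)                      # (H, W)
    return (variance - variance.min()) / (variance.max() - variance.min() + 1e-8)

def bcr_subnetwork(view_stack):
    """Background content recovery: coarse estimate of occluded contents."""
    # Placeholder: the per-pixel median across views approximates the
    # background when most views see past the occluder.
    return np.median(view_stack, axis=0)                   # (H, W)

def refinement_subnetwork(mask, coarse, center_view):
    """Aggregate the mask and the coarse recovery into the refined result."""
    # Keep the center view where no occlusion is detected; use the
    # recovered contents inside the masked region.
    return (1.0 - mask) * center_view + mask * coarse

# Multi-angle view stack: V views of an H x W scene (grayscale for brevity).
V, H, W = 9, 8, 8
rng = np.random.default_rng(0)
views = np.tile(rng.random((H, W)), (V, 1, 1))             # consistent background
views[:4, 2:5, 2:5] = 1.0                                  # occluder seen in some views

mask = fol_subnetwork(views)
coarse = bcr_subnetwork(views)
refined = refinement_subnetwork(mask, coarse, views[V // 2])
```

The split mirrors the abstract: because locating the occluder and recovering the background both reduce to reasoning over view-to-view consistency, the FOL and BCR stages can share one structure, and only the final aggregation differs.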
