Depth-image-based rendering is a popular way to produce content for three-dimensional television and free viewpoint video, allowing the synthesis of numerous viewpoints from a single reference view and its depth map. Because of the synthesis process and the nature of the input, artifacts and holes appear in the synthesized views, and removing them is challenging. In this letter, we propose solutions that remove those artifacts and apply different filling strategies depending on the nature of each hole. Cracks are identified and filled using very local neighborhood information. Regions classified as ghosts are projected to their correct place. The remaining holes are classified as disocclusions or out-of-field areas, and each class is filled with an appropriate adaptation of a popular inpainting method. In both adaptations, patch matching exploits spatial locality, using dynamically adaptive patch sizes from the reference image. For disocclusions, we propose a filling order based on depth and background terms, and a search process that considers only background patches. We show that our method outperforms several view synthesis methods in the quantitative evaluation and produces consistent visual results for both large baselines and severely occluded scenes.
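To make the hole types concrete, the following is a minimal sketch of how holes arise when a single row is forward-warped, and how narrow cracks might be filled from their immediate neighbors. It is an illustration under stated assumptions, not the method proposed in the letter: the names `warp_row` and `fill_cracks`, the purely horizontal shift by a rounded disparity, the one-pixel crack width, and the averaging of the left and right neighbors are all hypothetical choices made here.

```python
HOLE = None  # marker for target pixels that received no sample during warping


def warp_row(colors, disparities):
    """Forward-warp one row: pixel x moves to x - round(disparity).

    Assumption: a rectified horizontal camera shift, so warping is 1-D.
    A z-buffer keeps the sample with the largest disparity (closest surface).
    """
    width = len(colors)
    out = [HOLE] * width
    best_disp = [-1.0] * width
    for x, (color, disp) in enumerate(zip(colors, disparities)):
        tx = x - int(round(disp))
        if 0 <= tx < width and disp > best_disp[tx]:
            out[tx] = color
            best_disp[tx] = disp
    return out


def fill_cracks(row, max_width=1):
    """Fill hole runs of at most max_width pixels bounded by valid pixels.

    Assumption: a crack is a very narrow hole, filled here with the mean of
    its left and right neighbors. Wider runs (disocclusions) and runs that
    touch the image border (out-of-field areas) are left untouched.
    """
    out = list(row)
    x = 0
    while x < len(row):
        if row[x] is not HOLE:
            x += 1
            continue
        start = x
        while x < len(row) and row[x] is HOLE:
            x += 1
        end = x  # exclusive end of the hole run
        if end - start <= max_width and start > 0 and end < len(row):
            fill = (row[start - 1] + row[end]) / 2
            for i in range(start, end):
                out[i] = fill
    return out
```

In this sketch, the holes that `fill_cracks` leaves behind correspond to the larger regions that the abstract assigns to inpainting; the depth-aware filling order and background-only patch search are not modeled here.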