Abstract

Shallow depth-of-field (DoF), which keeps the region of interest in focus while blurring out the rest of the image, is challenging in computer vision and computational photography. It can be achieved either by adjusting the parameters (e.g., aperture and focal length) of a single-lens reflex camera or by computational techniques. In this paper, we investigate the latter, i.e., we explore a computational method to render shallow DoF. Previous methods rely on either portrait segmentation or stereo sensing, and therefore can only be applied to portrait photos or require stereo inputs. To address these issues, we study the problem of rendering shallow DoF from an arbitrary image. In particular, we propose a method that consists of a salient object detection (SOD) module, a monocular depth prediction (MDP) module, and a DoF rendering module. The SOD module determines the focal plane, while the MDP module controls the blur degree. Specifically, we introduce a label-guided ranking loss for both salient object detection and depth prediction. For salient object detection, the label-guided ranking loss comprises two terms: (i) a heterogeneous ranking loss that encourages the sampled salient pixels to differ from background pixels; (ii) a homogeneous ranking loss that penalizes inconsistency among salient pixels or among background pixels. For depth prediction, the label-guided ranking loss mainly relies on multilevel structural information, i.e., from low-level edge maps to high-level object instance masks. In addition, we introduce an SOD- and depth-aware blur rendering method to generate shallow DoF images. Comprehensive experiments demonstrate the effectiveness of our proposed method.
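The two ranking terms for salient object detection can be sketched as follows. This is a hedged illustration, not the paper's implementation: the pair-sampling strategy, the margin of 0.5, the number of sampled pairs, and the equal weighting of the heterogeneous and homogeneous terms are all assumptions made for the sketch.

```python
import numpy as np

def label_guided_ranking_loss(pred, label, margin=0.5, n_pairs=64, rng=None):
    """Sketch of a label-guided ranking loss for SOD (illustrative only).

    pred  : (H, W) predicted saliency in [0, 1]
    label : (H, W) binary ground-truth mask (1 = salient)
    """
    rng = np.random.default_rng(rng)
    sal = pred[label > 0.5]   # predictions at labeled salient pixels
    bg  = pred[label <= 0.5]  # predictions at labeled background pixels
    if sal.size == 0 or bg.size == 0:
        return 0.0
    s = rng.choice(sal, n_pairs)
    b = rng.choice(bg, n_pairs)
    # Heterogeneous term: salient scores should exceed background scores
    hetero = np.maximum(0.0, margin - (s - b)).mean()
    # Homogeneous terms: scores within each class should be consistent
    s2 = rng.choice(sal, n_pairs)
    b2 = rng.choice(bg, n_pairs)
    homo = np.abs(s - s2).mean() + np.abs(b - b2).mean()
    return hetero + homo
```

With a perfect prediction (pred equal to the binary label) both terms vanish, while a flat prediction is penalized by the heterogeneous term only.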

Highlights

  • Breathtaking photography is all about narrative, i.e., the story the image is telling

  • We present an automatic system consisting of a salient object detection (SOD) module, a monocular depth prediction (MDP) module, and a DoF rendering module for rendering realistic shallow DoF from an arbitrary image


Summary

Introduction

Breathtaking photography is all about narrative, i.e., the story the image is telling. We step further to study the problem of rendering shallow DoF effects from an unconstrained image. To this end, we propose a method that consists of a salient object detection (SOD) module, a monocular depth prediction (MDP) module, and a DoF rendering module. Due to its large number of parameters, the fully connected layer decreases computational efficiency; to address this issue, several methods adopt a Fully Convolutional Network (FCN) to generate pixel-wise saliency maps. Training with such a pixel-wise loss suffers from interclass indistinction and intraclass inconsistency. To mitigate these issues, we propose a label-guided ranking loss that explicitly models the neighboring relationships. This operation is similar to the visual attention mechanism of primates (i.e., center-surround differences [19,20]).
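The overall pipeline (SOD sets the focal plane, MDP drives the blur degree) can be sketched minimally as below. This is an illustrative rendering scheme under stated assumptions, not the paper's renderer: the choice of the median salient depth as the focal plane, the quantization into discrete blur levels, and the `gaussian_blur` helper are all assumptions of the sketch.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding (helper for the sketch)."""
    if sigma <= 0:
        return img.astype(float)
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    blur1d = lambda m: np.convolve(np.pad(m, r, mode='edge'), k, mode='valid')
    tmp = np.apply_along_axis(blur1d, 1, img.astype(float))  # blur rows
    return np.apply_along_axis(blur1d, 0, tmp)               # blur columns

def render_shallow_dof(img, depth, saliency, n_levels=4, max_sigma=3.0):
    """Sketch of SOD- and depth-aware DoF rendering (illustrative only).

    img      : (H, W) grayscale image
    depth    : (H, W) predicted depth map (from the MDP module)
    saliency : (H, W) binary salient-object mask (from the SOD module)
    """
    focal = np.median(depth[saliency > 0.5])  # focal plane from the SOD mask
    coc = np.abs(depth - focal)               # circle-of-confusion proxy
    coc = coc / (coc.max() + 1e-8)
    out = np.zeros_like(img, dtype=float)
    for k in range(n_levels):
        sigma = max_sigma * k / max(n_levels - 1, 1)
        blurred = gaussian_blur(img, sigma)
        # pixels whose normalized CoC falls in this level take this blur
        lo, hi = k / n_levels, (k + 1) / n_levels
        mask = (coc >= lo) & (coc < hi) if k < n_levels - 1 else (coc >= lo)
        out[mask] = blurred[mask]
    return out
```

Pixels on the focal plane fall into the zero-blur level and pass through unchanged, while pixels far from it receive the strongest blur.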

Monocular Depth Prediction
DoF Rendering
Method
DoF rendering
Network Architecture
Shallow DoF
Ablation Studies
Comparison with State-of-the-Art Methods
Findings
Conclusions