Abstract

Removing a specific object from an image and filling the hole left behind with visually plausible background is an intriguing task. Although recent deep-learning-based object removal methods have shown promising results on some structured scenes, none has addressed object removal in facial images. The objective of this work is to remove the microphone from facial images and fill the hole with correct facial semantics and fine details. To make our solution practically useful, we present an interactive method called MRGAN, in which the user roughly marks the microphone region. To fill the hole, we employ a Generative Adversarial Network (GAN) based image-to-image translation approach. We break the problem into two stages: an inpainter and a refiner. The inpainter produces a coarse prediction by roughly filling in the microphone region, and the refiner then produces fine details in that region. We combine perceptual loss, reconstruction loss, and adversarial loss into a joint loss function so that the generated face is realistic and structurally similar to the ground truth. Because paired facial images with and without a microphone do not exist, we train our method on a synthetic microphone dataset generated from CelebA face images and evaluate it on real-world microphone images. Our extensive evaluation shows that MRGAN outperforms state-of-the-art image manipulation methods on real microphone images, even though it is trained only on the synthetic dataset. Additionally, we provide ablation studies for the joint loss function and for different network arrangements.
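The joint loss named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything specific here is an assumption: the L1 form of the reconstruction term, the mean-squared feature distance for the perceptual term, the non-saturating form of the adversarial term, and the weights `w_rec`, `w_perc`, and `w_adv` are hypothetical placeholders, not values from the paper. Images and features are represented as flat lists of floats to keep the example dependency-free.

```python
import math


def l1_loss(pred, target):
    """Mean absolute error between two equal-length flat pixel lists
    (assumed form of the reconstruction loss)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def perceptual_loss(pred_feats, target_feats):
    """Mean squared distance between feature vectors. The features are
    assumed to come from a fixed pretrained network, not computed here."""
    return sum((p - t) ** 2 for p, t in zip(pred_feats, target_feats)) / len(pred_feats)


def adversarial_loss(disc_prob_fake, eps=1e-8):
    """Non-saturating generator loss -log D(G(x)), where disc_prob_fake is
    the discriminator's probability that the generated image is real."""
    return -math.log(disc_prob_fake + eps)


def joint_loss(pred, target, pred_feats, target_feats, disc_prob_fake,
               w_rec=1.0, w_perc=0.1, w_adv=0.01):
    """Weighted sum of the three terms. The weights are hypothetical."""
    return (w_rec * l1_loss(pred, target)
            + w_perc * perceptual_loss(pred_feats, target_feats)
            + w_adv * adversarial_loss(disc_prob_fake))
```

In a sketch like this, the reconstruction and perceptual terms pull the output toward the ground truth, while the adversarial term rewards outputs the discriminator accepts as real. How the terms are actually balanced is a question for the paper's ablation study.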

Highlights

  • Facial expressions are an important part of daily life communication

  • Because we want our work to be deployed in the real world as a working application, we take an interactive approach in which the user manually provides the microphone region and the method removes the microphone from the facial image

  • We experimentally show that MRGAN removes the microphone object more effectively and generates more plausible facial semantics than previous state-of-the-art image manipulation methods
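The interactive step in the highlights can be sketched as follows, under stated assumptions: the user's rough selection is taken to be a rectangular box, the network input is formed by blanking the selected pixels, and the final image keeps the original pixels outside the selection. None of these details is confirmed by the text above. The function names `box_mask`, `erase_region`, and `composite` are hypothetical, and images are small grayscale grids of floats for illustration.

```python
def box_mask(height, width, top, left, bottom, right):
    """Binary mask: 1 inside the user's rough box (bottom and right are
    exclusive), 0 elsewhere. A rectangular selection is an assumption."""
    return [[1 if top <= r < bottom and left <= c < right else 0
             for c in range(width)]
            for r in range(height)]


def erase_region(image, mask, fill=0.0):
    """Blank out the masked pixels to form a hole for the network to fill."""
    return [[fill if m else px for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def composite(original, generated, mask):
    """Take generated pixels inside the mask and original pixels outside it."""
    return [[g if m else o for o, g, m in zip(o_row, g_row, m_row)]
            for o_row, g_row, m_row in zip(original, generated, mask)]


# Usage: a 3x3 image whose lower-right 2x2 block is marked as microphone.
face = [[0.5] * 3 for _ in range(3)]
mask = box_mask(3, 3, top=1, left=1, bottom=3, right=3)
holed = erase_region(face, mask)
filled = composite(face, [[0.9] * 3 for _ in range(3)], mask)
```

Because only masked pixels are replaced, a rough selection that slightly overshoots the microphone still leaves the rest of the face untouched.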


Summary

Introduction

Facial expressions are an important part of daily life communication. A smile can signal our acceptance of a message, while a scowl might indicate disagreement. The goal of this research is to remove the microphone object from facial images. This involves detecting the microphone region and inpainting the hole left behind with plausible, correct content. The problem is challenging because (1) the result depends heavily on how accurately the microphone region is detected, (2) it is not easy to recover the complex facial semantics under the detected microphone region, and (3) training data, i.e., facial image pairs with and without a microphone, are scarce or non-existent. After the user selects the microphone region, which need not be exact, our algorithm successfully fills in the hole left behind with correct facial semantics with fine
