Abstract

Image inpainting aims to fill in corrupted regions with visually realistic and semantically plausible content. In this paper, we propose a progressive image inpainting method based on a forked-then-fused decoder network. A unit called PC-RN, which combines partial convolution with region normalization, serves as the basic component of the inpainting network. The PC-RN unit extracts useful features from the valid surroundings while suppressing the interference caused by incomplete regions. The forked-then-fused decoder network consists of a local reception branch, a long-range attention branch, and a squeeze-and-excitation-based fusing module. Two multi-scale contextual attention modules are deployed in the long-range attention branch to adaptively borrow features from distant spatial positions. The progressive inpainting strategy lets the attention modules use the previously filled region, which reduces the risk of allocating attention to the wrong positions. We conduct extensive experiments on three benchmark databases: Places2, Paris StreetView, and CelebA. Qualitative and quantitative results show that the proposed inpainting model outperforms state-of-the-art methods. Moreover, we perform ablation studies to reveal the contribution of each module to the image inpainting task.

Highlights

  • Image inpainting, which has been a research hotspot in the computer vision community, aims to fill in corrupted regions of an image with visually realistic and semantically plausible contents [1]

  • In this paper, we propose a novel end-to-end multi-stage pipeline consisting mainly of a shared encoder network and a forked-then-fused decoder network

  • The local reception branch is expected to infer the corrupted region conditioned on the valid surroundings


Summary

Introduction

Image inpainting, which has been a research hotspot in the computer vision community, aims to fill in corrupted regions of an image with visually realistic and semantically plausible contents [1]. Progressive inpainting strategies generally employ learnable convolution kernels to perceive the periphery of the corrupted region but neglect the contextual information outside the receptive field. To alleviate these problems, in this paper, we propose a novel end-to-end multi-stage pipeline consisting mainly of a shared encoder network and a forked-then-fused decoder network. The encoder network aims to capture the useful information from the valid region and to block out the objectionable interference derived from the corrupted region. To this end, we design a new network unit, called PC-RN, which equips the partial convolutional layer [30] with region-wise feature normalization [55]. The stage subscript t is dropped for clarity unless it is explicitly needed to distinguish between inpainting stages.
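To make the PC-RN idea concrete, the following is a minimal single-channel sketch of a partial convolution followed by region-wise normalization. It is not the authors' implementation. The function names (`partial_conv2d`, `region_norm`, `pc_rn_unit`), the zero padding, the omission of learnable affine parameters and activation, and the choice to normalize with the updated mask are all assumptions made for illustration. Plain Python lists are used only to keep the example dependency-free; a real network would use a tensor library.

```python
import math


def partial_conv2d(x, mask, kernel, bias=0.0):
    """Single-channel partial convolution with zero padding (same output size).

    x and mask are H x W nested lists; mask is 1 for valid pixels, 0 for holes.
    Returns (output, updated_mask).
    """
    h, w = len(x), len(x[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    new_mask = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc, valid = 0.0, 0
            for di in range(k):
                for dj in range(k):
                    ii, jj = i + di - r, j + dj - r
                    # Only valid, in-bounds pixels contribute to the sum.
                    if 0 <= ii < h and 0 <= jj < w and mask[ii][jj]:
                        acc += kernel[di][dj] * x[ii][jj]
                        valid += 1
            if valid > 0:
                # Rescale by window size / valid count to compensate for holes.
                out[i][j] = acc * (k * k) / valid + bias
                new_mask[i][j] = 1  # this position now counts as filled
    return out, new_mask


def region_norm(x, mask, eps=1e-5):
    """Normalize valid and corrupted pixels separately (no learnable affine)."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for region in (1, 0):
        idx = [(i, j) for i in range(h) for j in range(w) if mask[i][j] == region]
        if not idx:
            continue
        mean = sum(x[i][j] for i, j in idx) / len(idx)
        var = sum((x[i][j] - mean) ** 2 for i, j in idx) / len(idx)
        for i, j in idx:
            out[i][j] = (x[i][j] - mean) / math.sqrt(var + eps)
    return out


def pc_rn_unit(x, mask, kernel):
    """Hypothetical PC-RN unit: partial convolution, then region normalization.

    Assumption: normalization uses the mask updated by the partial convolution.
    """
    y, new_mask = partial_conv2d(x, mask, kernel)
    return region_norm(y, new_mask), new_mask


if __name__ == "__main__":
    # 5 x 5 image of ones with a 3 x 3 hole in the centre.
    image = [[1.0] * 5 for _ in range(5)]
    hole_mask = [[0 if 1 <= i <= 3 and 1 <= j <= 3 else 1 for j in range(5)]
                 for i in range(5)]
    box = [[1.0 / 9.0] * 3 for _ in range(3)]
    features, updated = pc_rn_unit(image, hole_mask, box)
    for row in updated:
        print(row)
```

In this toy input, one 3 x 3 partial convolution shrinks the hole from nine pixels to the single centre pixel, which illustrates why stacking such layers (or running several inpainting stages) gradually fills the corrupted region from its boundary inward.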

Shared Encoder Network
Forked-Then-Fused Decoder Network
Local Reception Branch
Long-Range Attention Branch
Progressive Inpainting Strategy
Loss Function
Experiments
Experimental Setup
Qualitative Results
Quantitative Results
Ablation Studies
Ablation Study on the MSCA Module
Ablation Study on the SE-Based Fusing Module
Ablation Study on the Number of Inpainting Stages
Ablation Study on the Collaborative Effect between Inpainting Stages
Conclusions
