Abstract

Image inpainting is the task of reconstructing missing regions of an image, with applications across imaging and graphics, including object removal, image restoration, and image manipulation. Although various deep learning techniques have made notable progress in image restoration, the local nature of convolutional operations prevents them from effectively capturing global semantic information, leading to issues such as structural ambiguity and semantic incompleteness, particularly for large missing regions. To overcome these limitations, this article proposes a novel Transformer-based self-supervised attention generative adversarial image inpainting method that leverages the Transformer's self-attention mechanism to capture global semantic information. The proposed technique introduces a self-supervised attention module into the Transformer to overcome the limitations of convolutional operations. Additionally, a hierarchical Swin Transformer with shifted windows is used in the discriminator to extract the image's contextual features, while semantic feature learning is performed using the Transformer structure. The generator also employs a depthwise over-parameterized convolutional layer (DO-Conv) to extract features and enhance model performance. Experimental evaluations and result analysis demonstrate that the proposed technique outperforms various existing approaches. Overall, the proposed method effectively addresses the limitations of existing approaches and demonstrates the superiority of the Transformer-based generative adversarial network structure over fully convolutional approaches.
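The abstract's central claim is that self-attention, unlike a convolution, lets every spatial position aggregate information from every other position, which is why it can capture global context across a large missing region. The following is a minimal NumPy sketch of scaled dot-product self-attention illustrating that global mixing; it is an illustration of the general mechanism, not the paper's specific self-supervised attention module, and the projection matrices `Wq`, `Wk`, `Wv` are assumed placeholders.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of feature vectors.

    x: (n_tokens, d) array of token features (e.g. flattened image patches).
    Wq, Wk, Wv: (d, d) projection matrices (hypothetical, randomly initialized
    in practice). Every output row is a weighted sum over ALL input rows,
    so each position sees the whole input -- unlike a local convolution.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # Similarity of every token with every other token, scaled for stability.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Row-wise softmax: each row becomes a distribution over all tokens.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With identity projections and identical input rows the attention weights are uniform, so the output is the (unchanged) mean of the rows; with learned projections, the weights instead concentrate on semantically relevant positions anywhere in the image.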
