Abstract
Context. The problem of image inpainting in computer graphics and computer vision systems is considered. The subject of the research is deep convolutional neural networks for image inpainting.
Objective. The objective of the research is to improve image inpainting performance in computer vision and computer graphics systems by applying the wavelet transform in the LaMa-Fourier network architecture.
Method. The basic LaMa-Fourier network decomposes the image into global and local texture. It is proposed to improve the network block that processes the global context of the image, namely the spectral transform block. To this end, the Fourier Unit Structure is replaced with the Simple Wavelet Convolution Block developed by the authors. In this block, a two-level 3D wavelet transform of the image is first performed using the Daubechies wavelet db4. The obtained coefficients of the 3D wavelet transform are split so that each subband represents a separate feature of the image. A convolutional layer, batch normalization and a ReLU activation function are applied sequentially to the split coefficients at each level of the wavelet transform. The resulting subbands of wavelet coefficients are concatenated and the inverse wavelet transform is applied to them; its result is the output of the block. Note that the wavelet coefficients at different levels are processed separately. This reduces the computational complexity of calculating the network outputs while preserving the influence of the context of each level on image inpainting. The resulting neural network is named LaMa-Wavelet. The FID, PSNR and SSIM indexes and visual analysis were used to estimate the quality of images inpainted with the LaMa-Wavelet network.
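The data flow of the block described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several simplifications are assumptions made only to keep the example dependency-free: the Haar wavelet is swapped in for db4, a fixed 1x1x1 channel mix across subbands stands in for the learned convolutional layer, per-channel standardisation stands in for batch normalization, and the approximation subband is grouped with the level-2 details. All function names (`haar_split`, `haar_merge`, `dwt3`, `idwt3`, `conv_bn_relu`, `wavelet_block`) are hypothetical.

```python
import numpy as np

S = np.sqrt(2.0)

def haar_split(x, axis):
    """One Haar analysis step along `axis` (its length must be even)."""
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    return (even + odd) / S, (even - odd) / S

def haar_merge(lo, hi, axis):
    """Inverse of haar_split: interleave reconstructed even/odd samples."""
    even, odd = (lo + hi) / S, (lo - hi) / S
    shape = list(lo.shape)
    shape[axis] *= 2
    return np.stack([even, odd], axis=axis + 1).reshape(shape)

def dwt3(x):
    """One-level 3D Haar transform; returns 8 subbands stacked on axis 0.
    Subband 0 is the approximation (low-pass along all three axes)."""
    bands = [x]
    for axis in range(3):
        bands = [half for b in bands for half in haar_split(b, axis)]
    return np.stack(bands)

def idwt3(bands):
    """Inverse of dwt3: merge subband pairs along axes 2, 1, 0."""
    bands = list(bands)
    for axis in (2, 1, 0):
        bands = [haar_merge(bands[i], bands[i + 1], axis)
                 for i in range(0, len(bands), 2)]
    return bands[0]

def conv_bn_relu(bands, weight):
    """Stand-in for conv + batch norm + ReLU (assumption, not learned):
    1x1x1 mix across subbands, per-channel standardisation, ReLU."""
    mixed = np.tensordot(weight, bands, axes=1)  # (out, in) x (in, D, H, W)
    mean = mixed.mean(axis=(1, 2, 3), keepdims=True)
    std = mixed.std(axis=(1, 2, 3), keepdims=True)
    return np.maximum((mixed - mean) / (std + 1e-5), 0.0)

def wavelet_block(x, w1, w2):
    """Two-level decomposition, separate processing per level, inversion."""
    level1 = dwt3(x)                     # 8 subbands at half resolution
    level2 = dwt3(level1[0])             # decompose the approximation again
    fine = conv_bn_relu(level1[1:], w1)  # 7 detail subbands of level 1
    coarse = conv_bn_relu(level2, w2)    # 8 subbands of level 2
    approx = idwt3(coarse)               # back to level-1 resolution
    return idwt3(np.concatenate([approx[None], fine]))

# Usage: a (channels, height, width) feature tensor, each size divisible by 4.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))
out = wavelet_block(features,
                    rng.standard_normal((7, 7)),
                    rng.standard_normal((8, 8)))
print(out.shape)
```

Because each level has its own mixing weights, the two levels never share a single large operator, which mirrors the separate per-level processing described in the abstract; the output has the same shape as the input, so the sketch could in principle sit where a spectral transform block would.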
Results. The proposed LaMa-Wavelet network has been implemented in software and studied on the image inpainting problem. The PSNR of images inpainted using LaMa-Wavelet exceeds the results obtained using the LaMa-Fourier network by 4.5% on average for narrow and medium masks and by 6% on average for large masks. Applying LaMa-Wavelet can improve SSIM by 2–4%, depending on the mask size. However, inpainting one image takes 3 times longer with LaMa-Wavelet than with the LaMa-Fourier network. Analysis of specific images demonstrates that both networks give similar results when inpainting a homogeneous background. On complex backgrounds with repeating elements, LaMa-Wavelet is often more effective in restoring textures.
Conclusions. The obtained LaMa-Wavelet network improves image inpainting with large masks by applying the wavelet transform in the LaMa network architecture. In particular, the quality of reconstruction of image edges and fine details is increased.