Abstract

Image inpainting is a problem that has long been studied in computer vision. With the development of deep learning, image inpainting has advanced rapidly together with convolutional neural networks and generative adversarial networks. It has since been extended to variants such as guided image filling and inpainting with various masking schemes, and the related task of image outpainting has also been introduced. More recently, following the announcement of the Vision Transformer, many computer vision problems have been tackled with it. In this paper, we address a generalized image inpainting problem using the Vision Transformer: filling in missing regions regardless of whether they lie inside or outside the image, and without any guidance. To this end, we define the problem as dropping the image in patch units, which makes it straightforward to handle with the Vision Transformer, and we solve it with a simple network obtained by slightly modifying the Vision Transformer to fit the task. We name this network PIPformers. PIPformers achieved better PSNR, RMSE, and SSIM values than competing methods.
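
The patch-drop formulation described above can be illustrated with a minimal sketch. The following NumPy code is an assumed illustration only (the patch size, drop ratio, and function names are ours, not taken from the paper): it tokenizes an image into non-overlapping patches and zeroes out a random subset, producing the corrupted token sequence that a PIPformers-style network would be trained to restore.

```python
import numpy as np

def to_patches(img, patch):
    """Split an (H, W, C) image into non-overlapping (patch x patch x C) tiles,
    returned as a (num_patches, patch*patch*C) array -- the token layout a
    vision transformer consumes."""
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0
    tiles = img.reshape(h // patch, patch, w // patch, patch, c)
    tiles = tiles.transpose(0, 2, 1, 3, 4)          # (gh, gw, patch, patch, c)
    return tiles.reshape(-1, patch * patch * c)

def drop_patches(tokens, drop_ratio, rng):
    """Zero out a random subset of patch tokens; the dropped indices mark the
    regions the network must repaint, whether they fall inside the original
    frame (inpainting) or along its border (outpainting)."""
    n = tokens.shape[0]
    dropped = rng.choice(n, size=int(n * drop_ratio), replace=False)
    corrupted = tokens.copy()
    corrupted[dropped] = 0.0
    return corrupted, dropped

rng = np.random.default_rng(0)
image = rng.random((224, 224, 3)).astype(np.float32)   # stand-in input image
tokens = to_patches(image, patch=16)                    # (196, 768) for 224/16
corrupted, dropped = drop_patches(tokens, drop_ratio=0.3, rng=rng)

# Training objective (conceptually): a ViT-style encoder maps the `corrupted`
# tokens to predicted tokens, and a pixel-level loss such as MSE is computed
# against the original `tokens` at the dropped positions.
```

This only sketches the problem setup; the actual network architecture, masking strategy, and loss used by PIPformers are described in the full paper.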
