Bridging Global Context Interactions for High-Fidelity Pluralistic Image Completion.

Chuanxia Zheng, Guoxian Song, Tat-Jen Cham, Jianfei Cai, Linjie Luo, Dinh Phung

doi:10.1109/tpami.2024.3403695

Abstract

We introduce PICFormer, a novel framework for Pluralistic Image Completion using a transFormer-based architecture that achieves both high quality and diversity at a much faster inference speed. Our key contribution is code-shared codebook learning using a restrictive CNN on small and non-overlapping receptive fields (RFs) for the local visible token representation. This results in a compact yet expressive discrete representation, facilitating efficient modeling of global visible context relations by the transformer. Unlike the prevailing autoregressive approaches, we propose to sample all tokens simultaneously, leading to more than 100× faster inference speed. To enhance appearance consistency between visible and generated regions, we further propose a novel attention-aware layer (AAL), designed to better exploit distantly related high-frequency features. Through extensive experiments, we demonstrate that PICFormer efficiently learns semantically-rich discrete codes, resulting in significantly improved image quality. Moreover, our diverse image completion framework surpasses state-of-the-art methods on multiple image completion datasets. The project page is available at https://chuanxiaz.com/picformer/.
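
As a rough illustration of what a discrete representation over small, non-overlapping receptive fields can look like, the following minimal Python sketch cuts a tiny grayscale image into disjoint patches and assigns each patch the index of its nearest entry in a shared codebook. It is not the authors' implementation: the abstract describes a learned codebook and a restrictive CNN encoder, whereas this sketch assumes raw patch pixels as features and a hand-written two-entry codebook. The names split_into_patches, nearest_code, tokenize, IMAGE and CODEBOOK are hypothetical.

```python
from typing import List, Sequence

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def split_into_patches(image: Image, patch: int) -> List[List[float]]:
    """Cut the image into non-overlapping patch x patch blocks, row-major.

    Each block is flattened to a vector. Blocks never share pixels, so a
    token derived from one block cannot be influenced by another block.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[top + dy][left + dx]
                            for dy in range(patch)
                            for dx in range(patch)])
    return patches


def nearest_code(vector: Sequence[float],
                 codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest in squared L2 distance."""
    def sq_dist(entry: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(vector, entry))
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))


def tokenize(image: Image, patch: int,
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map every non-overlapping patch to one discrete token (codebook index)."""
    return [nearest_code(p, codebook) for p in split_into_patches(image, patch)]


# Assumption: a toy codebook with one "dark" and one "bright" 2x2 entry.
# In the paper the codebook is learned, not written by hand.
CODEBOOK = [
    [0.0, 0.0, 0.0, 0.0],  # token 0: dark patch
    [1.0, 1.0, 1.0, 1.0],  # token 1: bright patch
]

# Assumption: a 4x4 toy image whose four 2x2 quadrants are dark or bright.
IMAGE = [
    [0.1, 0.0, 0.9, 1.0],
    [0.0, 0.2, 1.0, 0.8],
    [0.9, 0.9, 0.1, 0.0],
    [1.0, 0.7, 0.0, 0.1],
]

print(tokenize(IMAGE, 2, CODEBOOK))  # one token per 2x2 patch
```

Because the patches are disjoint, changing pixels inside one patch can only change that patch's token. In this toy setting that is the property that keeps each local token independent of its neighbours, leaving relations between tokens to a separate global model.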

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication Date: Jan 1, 2024
Citations: 1

Similar Papers

Dynamic and distributed properties of many-neuron ensembles in the ventral posterior medial thalamus of awake rats.
M A Nicolelis ... J K Chapin
Proceedings of the National Academy of Sciences | VOL. 90
15 Mar 1993

Encoding shape and spatial relations: The role of receptive field size in coordinating complementary representations
Robert A Jacobs
Cognitive Science | VOL. 18
01 Sep 1994

Primary somatosensory cortex modulation of tactile responses in nucleus gracilis cells of rats.
Eduardo Malmierca ... Angel Nunez
The European journal of neuroscience | VOL. 19
01 Mar 2004

Does Neuronal Synchrony Underlie Visual Feature Grouping?
Ben J.A Palanca ... Gregory C Deangelis
Neuron | VOL. 46
01 Apr 2005
