Abstract

We study the colorization problem in monochrome-color dual-lens camera systems, i.e., colorizing the gray image from the monochrome camera using the color image from the color camera as a reference. Among existing methods, cost-volume-based CNN methods achieve state-of-the-art results, but they consume a large amount of GPU memory because they build a 4D cost volume. Recently, slice-wise cross-attention-based methods have been proposed for related problems. Slice-wise cross-attention requires much less GPU memory, but applying it directly to this colorization problem does not produce competitive results. We exploit the non-local computation property of cross-attention to propose a transformer-based method. To overcome the limitations of straightforward slice-wise cross-attention, we propose the spatially consistent cross-attention (SCCA) block, which encourages pixels of slices across different epipolar lines in the gray image to find spatially consistent correspondences with pixels of the reference color image. To further reduce the memory cost while preserving colorization accuracy, we design a pyramid processing strategy that cascades a series of SCCA blocks with smaller slice sizes and performs the colorization from coarse to fine. To extract more powerful image features, we use several regional self-attention (RSA) blocks with U-style connections. Experimental results show that our method outperforms state-of-the-art methods by a large margin on datasets synthesized from Cityscapes, Sintel, and SceneFlow, and on a real monochrome-color dual-lens dataset.
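
As a rough illustration of the plain slice-wise cross-attention baseline mentioned above (not the proposed SCCA block), the following minimal sketch colorizes each row independently. It assumes a rectified image pair, so that each epipolar line is an image row, and it assumes per-pixel feature vectors are already available. The function names `attend_row` and `slice_wise_colorize` and the nested-list data layout are hypothetical choices made for this example only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_row(gray_feats, ref_feats, ref_chroma):
    """Cross-attention for one slice (assumed here to be one image row).

    gray_feats: W feature vectors from the gray image (queries).
    ref_feats:  W feature vectors from the reference color image (keys).
    ref_chroma: W chroma vectors from the reference color image (values).
    Returns W chroma vectors, one per gray pixel.
    """
    dim = len(gray_feats[0])
    channels = len(ref_chroma[0])
    colored = []
    for query in gray_feats:
        # Scaled dot-product similarity against every reference pixel in the row.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in ref_feats
        ]
        weights = softmax(scores)
        # Each output color is an attention-weighted average of reference colors.
        colored.append([
            sum(w * px[c] for w, px in zip(weights, ref_chroma))
            for c in range(channels)
        ])
    return colored


def slice_wise_colorize(gray_feats, ref_feats, ref_chroma):
    """Apply attend_row to every row independently (no cross-row coupling)."""
    return [
        attend_row(g, r, c)
        for g, r, c in zip(gray_feats, ref_feats, ref_chroma)
    ]
```

In this sketch each row only ever forms a W-by-W table of scores, and rows never exchange information, so nothing pushes neighboring rows toward matches that agree with each other.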
