Abstract
Stereo image super-resolution exploits additional features from cross-view image pairs to reconstruct high-resolution (HR) images. Several recent methods exploit cross-view features along epipolar lines to improve the visual quality of the recovered HR images. Despite their impressive performance, these methods leave the global contextual features of cross-view images unexplored. In this paper, we propose a cross view capture network (CVCnet) for stereo image super-resolution that uses both global contextual and local features extracted from both views. Specifically, we design a cross view block to capture diverse feature embeddings from the two views of a stereo pair. In addition, we propose a cascaded spatial perception module that reweights each location in the feature maps according to its weight, making feature extraction more effective. Extensive experiments demonstrate that the proposed CVCnet outperforms state-of-the-art image super-resolution methods on stereo image super-resolution tasks. The source code is available at https://github.com/xyzhu1/CVCnet.
Highlights
Stereo image super-resolution (SR) is attracting considerable attention due to its great value in 3D applications
Our cross view capture network (CVCnet) is composed of three integral components, i.e., initial feature extraction (IFE), cross view block (CVB) and spatial perception module (SPM)
To further prove that the improved performance is not caused by adding more parameters, we present a model variant by adding a convolution layer that has the same number of increased parameters as when including the CVB
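The parameter-matched control described in the last highlight can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the parameter budget, channel width, and kernel size are hypothetical values chosen for illustration, and the helper names are invented here. It only relies on the standard parameter count of a 2D convolution, k × k × C_in × C_out weights plus C_out biases.

```python
def conv2d_params(c_in, c_out, kernel, bias=True):
    """Learnable parameters in a standard 2D convolution layer."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def matched_out_channels(extra_params, c_in, kernel, bias=True):
    """Largest output width whose conv layer fits within extra_params."""
    # Each output channel costs one k x k x c_in filter plus an optional bias.
    per_channel = kernel * kernel * c_in + (1 if bias else 0)
    return extra_params // per_channel


# Hypothetical numbers: NOT reported in the paper, purely illustrative.
cvb_extra_params = 150_000  # assumed parameter increase from adding the CVB
c_in, kernel = 64, 3        # assumed feature width and kernel size

c_out = matched_out_channels(cvb_extra_params, c_in, kernel)
control_params = conv2d_params(c_in, c_out, kernel)
print(f"control conv: {c_in} -> {c_out} channels, {control_params} parameters")
```

If such a control variant gains less than the variant with the CVB, the improvement can be attributed to the block's design rather than to the added capacity, which is the point of the comparison.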
Summary
Stereo image super-resolution (SR) is attracting considerable attention due to its great value in 3D applications. Demand for 3D content with finer resolution [1]–[6] has been high since the rise of immersive technologies, including high-resolution (HR) 3DTV, augmented reality (AR), and virtual reality (VR). Limitations of stereo image capture devices, e.g., the dual cameras on mobile phones, may produce low-resolution (LR) stereo image pairs. In addition, narrow network bandwidth restricts the transmission of high-resolution stereo images. Stereo image SR aims to generate HR stereo image pairs from their low-resolution counterparts to significantly enhance their visual quality. This technique has great potential to improve the user experience when deploying immersive services.