Abstract

Many low-level vision tasks, including guided depth super-resolution (GDSR), suffer from insufficient paired training data. Self-supervised learning is a promising solution, but upsampling depth maps without explicit supervision from high-resolution target images remains challenging. To alleviate this problem, we propose a self-supervised depth super-resolution method with contrastive multiview pre-training. Unlike existing contrastive learning methods for classification or segmentation tasks, our strategy can be applied to regression tasks, even when trained on a small-scale dataset, and can reduce information redundancy by extracting unique features from the guide. Furthermore, we propose a novel mutual modulation scheme that effectively computes the local spatial correlation between cross-modal features. Extensive experiments demonstrate that our method outperforms state-of-the-art GDSR methods and generalizes well to other modalities.
