Abstract
Registration of multisensor or multimodal image pairs with large distortions is a fundamental task in many remote sensing applications. To achieve accurate, low-cost remote sensing image registration, we propose a multiscale framework with unsupervised learning, named MU-Net. Without costly ground-truth labels, MU-Net directly learns an end-to-end mapping from image pairs to their transformation parameters. MU-Net stacks several deep neural network (DNN) models at multiple scales to form a coarse-to-fine registration pipeline, which keeps backpropagation from falling into a local extremum and withstands significant image distortions. We design a novel loss-function paradigm based on structural similarity, which makes MU-Net suitable for various types of multimodal images. MU-Net is compared with traditional feature-based and area-based methods, as well as with supervised and other unsupervised learning methods, on optical-optical, optical-infrared, optical-synthetic aperture radar (SAR), and optical-map datasets. Experimental results show that MU-Net achieves more comprehensive and accurate registration of image pairs with geometric and radiometric distortions. The code, implemented in PyTorch, is available at https://github.com/yeyuanxin110/MU-Net.
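The idea of a structural-similarity-based registration loss can be illustrated with a minimal sketch. The snippet below computes a global SSIM-style score between a warped image and a reference and turns it into a loss to minimize; this is a simplified NumPy illustration with standard SSIM constants, not the paper's actual multiscale loss, which is defined in the linked repository.

```python
import numpy as np

def ssim(x, y, c1=0.01**2, c2=0.03**2):
    """Global structural similarity between two images with values in [0, 1].

    c1 and c2 are the usual small stabilizing constants from the SSIM
    literature; a full SSIM uses local windows, omitted here for brevity.
    """
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx**2 + my**2 + c1) * (vx + vy + c2)
    )

def similarity_loss(warped, reference):
    # Registration objective: maximizing structural similarity between the
    # warped moving image and the reference = minimizing 1 - SSIM.
    return 1.0 - ssim(warped, reference)
```

For a perfectly aligned pair the loss is zero, and it grows as the warped image drifts structurally from the reference, giving an unsupervised training signal that needs no ground-truth transformation parameters.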
Published in: IEEE Transactions on Geoscience and Remote Sensing