Abstract

Detecting similarities between image patches and measuring their mutual displacement are important steps in the registration of multimodal remote sensing (RS) images. Deep learning approaches advance the discriminative power of learned similarity measures (SMs). However, their ability to find the best spatial alignment of the compared patches is often ignored. We propose to unify the patch discrimination and localization problems by assuming that the more accurately two patches can be aligned, the more similar they are. The uncertainty, or confidence, in the localization of a patch pair then serves as a similarity measure for these patches. We train a two-channel patch matching convolutional neural network (CNN), called DLSM, to solve a regression problem with uncertainty. This CNN takes two multimodal patches as input and outputs a prediction of the translation vector between them, together with the uncertainty of this prediction in the form of an error covariance matrix of the translation vector. The proposed patch matching CNN thus predicts a two-dimensional normal distribution of the translation vector rather than a single point estimate. The determinant of the covariance matrix is used both as a measure of uncertainty in the matching of patches and as a measure of similarity between patches. For training, we use a Siamese architecture with three towers: two towers receive the same pair of multimodal patches shifted by a random translation, while the third tower is fed a pair of dissimilar patches. Experiments performed on a large base of real RS images show that the proposed DLSM has both a higher discriminative power and more precise localization than existing hand-crafted SMs and SMs trained with conventional losses. Unlike existing SMs, DLSM correctly predicts the translation error distribution ellipse for different modalities, noise levels, and isotropic and anisotropic structures.
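
The core idea of the abstract, regressing a translation vector together with an error covariance matrix, can be sketched as a Gaussian negative log-likelihood loss. The following minimal sketch is an illustration, not the authors' code: the Cholesky parameterization of the covariance and the softplus constraint on its diagonal are our assumptions, chosen only to keep the predicted covariance positive definite.

```python
# Minimal sketch of "regression with uncertainty": the network head
# predicts a 2-D translation mean and a covariance Sigma = L L^T via a
# lower-triangular Cholesky factor L; the training loss is the 2-D
# Gaussian negative log-likelihood, and det(Sigma) serves as the
# uncertainty / (dis)similarity score. Parameterization is assumed.
import torch

def gaussian_nll(mean, chol_params, target):
    """mean: (B, 2) predicted (dx, dy); chol_params: (B, 3) giving
    (l11, l21, l22); target: (B, 2) ground-truth translation."""
    l11 = torch.nn.functional.softplus(chol_params[:, 0])  # diag > 0
    l21 = chol_params[:, 1]
    l22 = torch.nn.functional.softplus(chol_params[:, 2])

    r = target - mean                       # residual
    # Solve L z = r for the 2x2 triangular system in closed form
    z1 = r[:, 0] / l11
    z2 = (r[:, 1] - l21 * z1) / l22
    mahalanobis = z1 ** 2 + z2 ** 2         # r^T Sigma^{-1} r
    log_det = 2.0 * (torch.log(l11) + torch.log(l22))  # log|Sigma|
    # Constant 2*log(2*pi) term omitted; it does not affect training.
    return 0.5 * (mahalanobis + log_det).mean()

def uncertainty_score(chol_params):
    """det(Sigma) = (l11 * l22)^2; smaller means a more confident match."""
    l11 = torch.nn.functional.softplus(chol_params[:, 0])
    l22 = torch.nn.functional.softplus(chol_params[:, 2])
    return (l11 * l22) ** 2
```

Under this loss, a confidently aligned pair is penalized for large residuals, while an ambiguous pair can lower its loss by inflating the covariance, which is exactly what allows det(Sigma) to double as a similarity measure.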

Highlights

  • A popular strategy for solving the image registration problem involves two steps: finding a set of putative correspondences (PCs) between patches of the registered images and estimating the geometrical transform parameters between these images on the basis of the found PCs [1,2]

  • Both the two-stream convolutional neural network (CNN)-based similarity measures (SMs) trained with different loss functions and the Siamese CNN-based Deep Localization Similarity Measure (DLSM) are compared to five existing multimodal SMs: (1) an SM that combines Mutual Information with a gradient term highlighting large gradients with coinciding orientations in both modalities (GMI, Gradient with Mutual Information) [55]; (2) Scale-Invariant Feature Transform (SIFT)-OCT [15]; (3) Modality Independent Neighborhood Descriptor (MIND) [16]; (4) Histogram of Orientated Phase Congruency (HOPC) [13]; and (5) the L2-Net descriptor CNN [27]

  • We have proposed a new CNN structure for training a multimodal similarity measure that satisfies two properties: a high discriminative power and accurate localization of the compared patches

Summary

Introduction

A popular strategy for solving the image registration problem involves two steps: finding a set of putative correspondences (PCs) between patches of the registered images and estimating the geometrical transform parameters between these images on the basis of the found PCs [1,2]. Our contribution to the patch matching problem is a novel convolutional neural network (CNN), called Deep Localization Similarity Measure (DLSM), designed to improve both discriminative power and localization accuracy compared to existing hand-crafted and learned SMs. Patch discrimination and localization are not addressed as different problems, but rather as two aspects of the same problem. An important feature lacking in existing SMs is that the localization accuracy is predicted, in the form of an error covariance matrix, for each pair of patches, including isotropic and anisotropic textures, patches with low and high SNR, and patches of different modalities. This covariance can be used advantageously to weight PCs during multimodal image registration [1,2,26], as in the sketch below.
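
To illustrate how such per-pair covariances could weight PCs, the following hedged sketch (our assumption, not the paper's procedure) computes a generalized least-squares estimate of a global translation between two images, so that confident matches (small covariance) carry more weight:

```python
# Hypothetical PC weighting: each putative correspondence i supplies a
# predicted shift t_i and an error covariance C_i; the generalized
# least-squares estimate of a common translation weights each shift by
# its information matrix C_i^{-1}.
import numpy as np

def weighted_translation(shifts, covariances):
    """shifts: (N, 2) predicted translations; covariances: (N, 2, 2)."""
    info_sum = np.zeros((2, 2))
    weighted = np.zeros(2)
    for t, c in zip(shifts, covariances):
        w = np.linalg.inv(c)           # information matrix of this PC
        info_sum += w
        weighted += w @ t
    return np.linalg.solve(info_sum, weighted)  # GLS estimate
```

Correspondences with a large predicted covariance contribute little information to the estimate, which is the practical benefit of predicting localization uncertainty per patch pair.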

Overview of Existing Patch Matching CNN Structure and Loss Functions
Discrimination and Localization Ability of Existing Patch Matching CNNs
Requirements to Complexity of Geometrical Transform Between Patches in RS
SM Performance Criteria
Patch Matching as Deep Regression with Uncertainty
Siamese ConvNet Structure and Training Process Settings
Patch Pair Alignment with Subpixel Accuracy
Experimental Part
Multimodal Image Dataset
Discriminative Power Analysis
Method
Patch Matching Uncertainty Analysis
Localization Accuracy Analysis
Findings
Conclusions