Abstract

Traditional stereo dense image matching (DIM) methods normally predefine a fixed window to compute the matching cost, and their performance is limited by the matching window size. A large matching window usually achieves robust matching results in weakly textured regions, but it may cause over-smoothing at disparity discontinuities and fine structures. A small window can recover sharp boundaries and fine structures, but it suffers from high matching uncertainty in weakly textured regions. To address this issue, we compute matching results with different matching window sizes and propose an adaptive method to fuse them so that a better matching result can be generated. The core algorithm designs a Convolutional Neural Network (CNN) to predict the probabilities of the large and small windows for each pixel and then refines these probabilities by imposing a global energy function. An approximate solution of the global energy function is obtained by breaking the optimization into per-pixel sub-optimizations along one-dimensional (1D) paths. Finally, the matching results of the large and small windows are fused by taking the refined probabilities as weights, yielding more accurate matching. We test our method on aerial image datasets, satellite image datasets, and the Middlebury benchmark with different matching cost metrics. Experiments show that the proposed adaptive fusion of multiple-window matching results transfers well across different datasets and outperforms the small, medium, and large windows as well as several state-of-the-art matching window selection methods.
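The final fusion step described above can be written down compactly. Below is a minimal Python/NumPy sketch of probability-weighted fusion of the two disparity maps; the array names and the linear weighting rule are illustrative assumptions rather than the authors' exact implementation, and the CNN prediction and 1D-path refinement of the probabilities are assumed to have been computed beforehand.

```python
import numpy as np

def fuse_disparities(d_small, d_large, p_large):
    """Fuse two disparity maps by per-pixel probability weighting.

    d_small : disparity map computed with the small matching window (H x W)
    d_large : disparity map computed with the large matching window (H x W)
    p_large : refined probability that the large window is the better
              choice at each pixel, values in [0, 1] (H x W)
    """
    p_large = np.clip(p_large, 0.0, 1.0)
    # Weighted combination: pixels dominated by the large window keep its
    # smooth estimate, pixels dominated by the small window keep sharp detail.
    return p_large * d_large + (1.0 - p_large) * d_small
```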

Highlights

  • The goal of stereo dense image matching (DIM) is to find pixel-wise correspondences between stereo image pairs, a problem that has attracted increasing attention in the photogrammetry and computer vision communities for decades [1,2]

  • We tested the proposed method on the above three datasets with three different matching cost metrics, Census [17], ZNCC [35], and MC-CNN-fst [22], and compared it with the matching results of 5 × 5, 9 × 9, and 15 × 15 pixel matching windows, as well as with a recent texture-based window selection method [25] that adaptively selects window sizes according to local intensity variations, a matching-confidence-based method that selects the window with the least matching uncertainty [31], and our previous window size selection network (WSSN) [32], which extracts both image texture features and disparity features with a convolutional neural network and uses fully connected layers to select the optimal window size

  • The matching results of all methods were optimized by Semi-Global Matching (SGM) and the same post-processing steps (e.g., Winner-Takes-All (WTA), a Left-Right Consistency (LRC) check, and disparity interpolation) with identical matching parameters; a minimal sketch of the WTA and LRC steps follows this list
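As a concrete illustration of the post-processing mentioned above, here is a hedged Python/NumPy sketch of Winner-Takes-All disparity selection and a Left-Right Consistency check. The function names, the 1-pixel consistency tolerance, and the use of NaN to mark invalid pixels are assumptions for illustration; the exact thresholds and the interpolation used in the experiments are not specified here.

```python
import numpy as np

def winner_takes_all(cost_volume):
    """Pick, for every pixel, the disparity with the minimum aggregated cost.

    cost_volume : (H, W, D) array of (SGM-aggregated) matching costs.
    """
    return np.argmin(cost_volume, axis=2)

def left_right_consistency(disp_left, disp_right, max_diff=1):
    """Invalidate pixels whose left and right disparities disagree.

    A left-image pixel (y, x) with disparity d should map to the right-image
    pixel (y, x - d) carrying (approximately) the same disparity.
    """
    h, w = disp_left.shape
    xs = np.arange(w)[None, :].repeat(h, axis=0)
    ys = np.arange(h)[:, None].repeat(w, axis=1)
    x_right = np.clip(xs - disp_left, 0, w - 1).astype(int)
    diff = np.abs(disp_left - disp_right[ys, x_right])
    out = disp_left.astype(float)
    out[diff > max_diff] = np.nan  # invalid pixels are later filled by interpolation
    return out
```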



Introduction

The goal of stereo dense image matching (DIM) is to find pixel-wise correspondences between stereo image pairs, a problem that has attracted increasing attention in the photogrammetry and computer vision communities for decades [1,2]. The image pairs are generally rectified into the epipolar image space so that correspondences lie in the same image row and differ only in their column coordinates; this difference is termed the disparity or parallax. Traditional DIM methods search for correspondences by comparing the similarities of their appearances (e.g., intensities, textures). Most DIM methods [10] predefine a fixed window to describe the appearance features of correspondences and compare appearance similarities by measuring the distances between these features, termed the matching cost. Various window-based matching cost metrics have been proposed over the last decades, and the differences among them mainly lie in the appearance feature descriptors used, e.g., image intensities, image gradients, and intensity rankings. Image intensity-based matching cost metrics [1,11,12] assume brightness constancy between correspondences and compute the matching cost by comparing intensities and intensity distributions; such methods are efficient and straightforward, but sensitive to noise and image radiometric distortions. Image gradient-based matching cost metrics compute gradients for each pixel and use either the gradients themselves or their distributions as feature descriptors [13–16]; such methods can compensate for intensity radiometric distortions and achieve robust matching results in textured regions. Intensity ranking-based matching costs [17–20] rank the intensities within a matching window by comparing them with the central pixel and use the ranking results as the feature descriptors. Among such methods, Census [17,21] may be the most commonly used one; it has been proven to be one of the most robust matching costs compared with other traditional methods [1].
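To make the ranking-based cost concrete, the following is a minimal Python/NumPy sketch of a Census transform and its Hamming-distance matching cost. The 5 × 5 window, the edge padding, and the strict "less than the centre" comparison are illustrative assumptions and not necessarily the exact variant evaluated in the paper.

```python
import numpy as np

def census_transform(img, window=5):
    """Binary descriptor: compare each neighbour with the central pixel.

    img    : 2-D grayscale image (H x W)
    window : odd side length of the Census window (5 is an assumption here)
    """
    r = window // 2
    h, w = img.shape
    padded = np.pad(img, r, mode="edge")
    bits = []
    for dy in range(window):
        for dx in range(window):
            if dy == r and dx == r:
                continue  # skip the central pixel itself
            neighbour = padded[dy:dy + h, dx:dx + w]
            bits.append((neighbour < img).astype(np.uint8))
    return np.stack(bits, axis=-1)  # (H, W, window*window - 1) bit descriptor

def census_cost(desc_left, desc_right_shifted):
    """Hamming distance between Census descriptors of candidate matches."""
    return np.sum(desc_left != desc_right_shifted, axis=-1)
```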

