Abstract

Disparity estimation from high-resolution stereo images is an essential task in remote sensing, yet it remains difficult because scenes are extremely complex and disparities vary widely. Stereo matching is especially hard in texture-less regions, repetitive patterns, disparity discontinuities, and occlusions. Convolutional neural networks have recently provided a new paradigm for disparity estimation, but current models struggle to achieve both accuracy and speed. This paper proposes a novel end-to-end network to overcome these obstacles. The network learns stereo matching at two scales: the low scale captures coarse-grained information and the high scale captures fine-grained information, which helps match structures of different sizes. Moreover, because disparity varies dramatically in remote sensing stereo images, we construct cost volumes that span negative to positive values so that the network handles both negative and nonnegative disparities. A 3D encoder-decoder module built from factorized 3D convolutions adaptively learns cost aggregation; it is efficient, alleviates the edge-fattening issue at disparity discontinuities, and approximates the matching of occluded regions. In addition, a refinement module uses shallow features as guidance to produce high-quality full-resolution disparity maps. The proposed network is compared with several typical models. Experimental results on a challenging dataset show that it learns and generalizes well: it performs convincingly in both accuracy and efficiency, and its improvements in the challenging areas listed above are noteworthy.
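To illustrate the idea of a cost volume that spans negative to positive disparities, here is a minimal NumPy sketch. It is not the network described in the abstract: it uses a plain per-pixel absolute difference on raw intensities instead of learned features, and the function name `signed_cost_volume`, the sign convention (left pixel `x` matches right pixel `x - d`), and the `+inf` fill for out-of-view positions are all assumptions made for this example.

```python
import numpy as np

def signed_cost_volume(left, right, d_min, d_max):
    """Absolute-difference cost volume over disparities d_min..d_max (inclusive).

    left, right: 2D float arrays of shape (H, W).
    Assumed convention: left pixel (y, x) corresponds to right pixel (y, x - d).
    Returns (disparities, volume) where volume has shape (D, H, W) and
    positions that fall outside the right image hold +inf.
    """
    h, w = left.shape
    disparities = np.arange(d_min, d_max + 1)
    volume = np.full((len(disparities), h, w), np.inf)
    for i, d in enumerate(disparities):
        if abs(d) >= w:
            continue  # shift larger than the image: no valid pixels
        if d >= 0:
            # right[x - d] exists only for x in [d, w)
            volume[i, :, d:] = np.abs(left[:, d:] - right[:, :w - d])
        else:
            # right[x - d] exists only for x in [0, w + d)
            volume[i, :, :w + d] = np.abs(left[:, :w + d] - right[:, -d:])
    return disparities, volume

# Synthetic pair: top rows have disparity +2, bottom rows have disparity -3.
rng = np.random.default_rng(0)
right = rng.random((4, 16))
left = np.zeros_like(right)
left[:2, 2:] = right[:2, :-2]
left[2:, :-3] = right[2:, 3:]

disparities, volume = signed_cost_volume(left, right, -4, 4)
# Winner-take-all: pick the disparity with the lowest cost at each pixel.
estimate = disparities[np.argmin(volume, axis=0)]
```

On this synthetic pair, a search range restricted to nonnegative disparities could not represent the bottom rows at all, which is the motivation for a signed range. In a learned model the absolute difference would be replaced by a comparison of feature maps, and the winner-take-all step by learned aggregation and regression.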

Highlights

  • Disparity estimation from a pair of high-resolution remote sensing stereo images is a fundamental yet challenging task

  • Given a pair of rectified stereo images, the goal of disparity estimation is to match corresponding pixels in the left and right images and compute a disparity map that records their horizontal displacements

  • The large size and sheer volume of remote sensing images demand stereo matching algorithms with high efficiency and good generalization. To overcome these obstacles and further improve the performance of end-to-end networks, in this paper, we propose a new convolutional neural network (CNN), named the dual-scale matching network, that focuses mainly on the following improvements
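The definition of disparity given in the highlights can be made concrete with a classical window-based matcher. The sketch below is a hand-crafted baseline, not the proposed network: it computes a sum of absolute differences over a small window and picks the best disparity per pixel. The names `box_sum` and `block_match`, the window radius, the nonnegative search range, and the penalty of 1.0 for out-of-view pixels (which assumes intensities in [0, 1]) are assumptions made for illustration.

```python
import numpy as np

def box_sum(img, radius):
    """Sum img over a (2*radius+1) x (2*radius+1) window, using edge padding."""
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def block_match(left, right, max_disp, radius=1):
    """Window-based matching with winner-take-all selection.

    Assumes rectified images with intensities in [0, 1], max_disp < width,
    and the convention that left pixel (y, x) matches right pixel (y, x - d).
    """
    h, w = left.shape
    best_cost = np.full((h, w), np.inf)
    disparity = np.zeros((h, w), dtype=int)
    for d in range(max_disp + 1):
        # Matching cost: absolute difference; out-of-view pixels get a penalty.
        diff = np.full((h, w), 1.0)
        diff[:, d:] = np.abs(left[:, d:] - right[:, :w - d])
        # Cost aggregation: sum the per-pixel cost inside a fixed window.
        cost = box_sum(diff, radius)
        # Disparity calculation: keep the lowest aggregated cost seen so far.
        better = cost < best_cost
        best_cost[better] = cost[better]
        disparity[better] = d
    return disparity

# Synthetic rectified pair in which every left pixel is displaced by 3 columns.
rng = np.random.default_rng(1)
right_img = rng.random((8, 20))
left_img = rng.random((8, 20))
left_img[:, 3:] = right_img[:, :-3]

disp_map = block_match(left_img, right_img, max_disp=5, radius=1)
```

A fixed window of this kind is what makes hand-crafted matchers fragile in texture-less or repetitive regions, since many candidate disparities yield nearly identical aggregated costs there.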


Summary

Introduction

Disparity estimation from a pair of high-resolution remote sensing stereo images is a fundamental yet challenging task. Given a pair of rectified stereo images, the goal of disparity estimation is to match corresponding pixels in the left and right images and compute a disparity map that records their horizontal displacements. Traditional algorithms [2,3,4,5] tackle this problem with the classical four-step pipeline: matching cost computation, cost aggregation, disparity calculation, and disparity refinement [6]. They compute the matching cost within a finite window and adopt hand-crafted schemes for the subsequent steps. Though significant progress has been made, they remain limited in handling texture-less regions, repeating

