Abstract
Stereo matching depth estimation for rectified image pairs is of great importance to many computer vision tasks, particularly autonomous driving. With the rise of convolutional neural networks, reliable learning-based depth estimation via stereo matching has become one of the central challenges for autonomous driving in recent years. Previous end-to-end trainable stereo matching networks have usually used cascaded convolution blocks with down-sampling or pooling operations to extract the unary features required for matching cost construction. Such approaches lack a reconstruction stage that would improve the pixel-wise alignment and strength of the feature maps, factors which play an important role in representing the similarity between stereo image pairs. To address this issue, in this paper we propose the progressive fusion stereo matching network (PFSM-Net). We exploit an encoder-decoder feature extraction architecture for multi-stage, multi-scale dynamic feature extraction. Moreover, we propose a group-wise concatenation method to construct the cost volume, which provides a more efficient cost volume for cost aggregation. Furthermore, we propose multi-scale cost aggregation networks with a progressive fusion strategy: the aggregated cost volume is progressively fused with the multi-stage, multi-scale cost volumes as the size of the cost volume increases. The multi-stage, multi-scale outputs are supervised and learned in a coarse-to-fine manner. Experimental results demonstrate that our method outperforms previous methods on the SceneFlow, KITTI 2012, and KITTI 2015 datasets.
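The abstract does not spell out the proposed group-wise concatenation scheme, but it builds on the widely used group-wise correlation cost volume (as in GwcNet): the feature channels are split into groups, and for each candidate disparity the left feature map is correlated with the horizontally shifted right feature map within each group. A minimal NumPy sketch of that underlying construction, with a hypothetical function name and shapes chosen for illustration, might look like this:

```python
import numpy as np

def groupwise_cost_volume(left, right, max_disp, num_groups):
    """Build a group-wise correlation cost volume from unary features.

    left, right: feature maps of shape (C, H, W), C divisible by num_groups.
    Returns a volume of shape (num_groups, max_disp, H, W), where entry
    [g, d, y, x] is the mean correlation of group g at disparity d.
    """
    C, H, W = left.shape
    assert C % num_groups == 0, "channels must split evenly into groups"
    cpg = C // num_groups  # channels per group
    volume = np.zeros((num_groups, max_disp, H, W), dtype=left.dtype)
    for d in range(max_disp):
        if d == 0:
            prod = left * right
        else:
            # shift the right view by d pixels; out-of-range columns stay zero
            prod = np.zeros_like(left)
            prod[:, :, d:] = left[:, :, d:] * right[:, :, :-d]
        # average the per-channel products within each group
        volume[:, d] = prod.reshape(num_groups, cpg, H, W).mean(axis=1)
    return volume
```

Concatenating such group-wise slices with (down-sampled) feature maps is one plausible reading of "group-wise concatenation"; the exact combination used in PFSM-Net would follow the paper's method section.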
More From: IEEE Transactions on Intelligent Transportation Systems