Abstract

State-of-the-art stereo matching models trained on synthetic datasets generalize poorly to real-world datasets. One major reason is that real-world illumination and texture are hard to simulate, which leaves a large gap between synthetic and real-world data. In this study, instead of narrowing the image-level appearance gap, we align the two data domains in feature space in an unsupervised manner and propose an end-to-end domain alignment stereo network (DAStereo). We introduce a domain alignment module (DAM) that learns a point-wise linear transformation. We demonstrate that DAM maintains sufficient alignment capacity with fewer parameters than a globally nonlinear mapping. To explicitly promote point-wise domain alignment, we further introduce adversarial learning with a cost volume discriminator in a hybrid training manner. Experimental results show that DAStereo outperforms state-of-the-art unsupervised and adaptive methods and even achieves performance comparable to some supervised methods.
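
For illustration only, one possible reading of a "point-wise linear transformation" is a single linear map applied independently to the feature vector at every pixel, which is equivalent to a 1x1 convolution. The NumPy sketch below shows that general idea under stated assumptions; it is not the DAM implementation from the paper. The function name `pointwise_linear`, the tensor shapes, and the identity initialization are all hypothetical.

```python
import numpy as np


def pointwise_linear(features, weight, bias):
    """Apply one shared linear map to the feature vector at every pixel.

    features: array of shape (C_in, H, W)
    weight:   array of shape (C_out, C_in)
    bias:     array of shape (C_out,)
    Returns an array of shape (C_out, H, W). This is equivalent to a
    1x1 convolution (hypothetical stand-in, not the paper's DAM).
    """
    out = np.einsum("oc,chw->ohw", weight, features)
    return out + bias[:, None, None]


# Hypothetical feature map: 32 channels on an 8 x 16 grid.
rng = np.random.default_rng(0)
feats = rng.standard_normal((32, 8, 16))

# Identity initialization (an assumption): the map starts as a no-op.
W_id = np.eye(32)
b_zero = np.zeros(32)
aligned = pointwise_linear(feats, W_id, b_zero)

# Parameter count depends on the channel count only, not on H or W.
n_params = W_id.size + b_zero.size
```

In this sketch the parameter count is fixed by the number of channels, regardless of image resolution, which is one reason a point-wise linear map can be lightweight.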
