Abstract

Recent multi-view stereo (MVS) methods benefit greatly from the representation ability of deep neural networks. However, top-performing networks require accurate depth ground truth rendered from scanned 3D models for training, which is challenging to capture in real-world scenarios. Handling label noise in the dense depth labels is therefore necessary when training under complex real-world conditions, a problem that has not been addressed before. In this paper, we investigate the effect of label noise in the ground-truth depth maps by appending noisy labels to the given accurate labels, and find that MVS networks are susceptible to noisy training data. To tackle this problem, we propose a novel label-noise-robust training framework for MVS networks. First, we propose a co-teaching-based noise-robust training method to filter label noise contained in the dense depth ground truth; it trains two parallel network branches simultaneously, and each branch passes its noise selection to the other. Then, we present a co-training scheme in which pseudo labels, i.e., the predicted depth maps that pass the geometric consistency check, are used as supervision for the peer network. Finally, a photometric consistency loss is introduced as additional supervision for all pixels. Experimental results demonstrate the robustness of deep MVS networks trained by our algorithm under a low noise rate (40%) and a high noise rate (65%), with 3D reconstruction performance comparable to that of a network trained on label-noise-free datasets.
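
The exchange of noise selection between the two branches can be illustrated with a minimal sketch. The abstract does not state the selection rule, so the code below assumes the small-loss criterion commonly used in co-teaching: each branch marks the pixels on which its own loss is smallest as probably clean, and its peer is trained only on those pixels. The per-pixel L1 loss, the function names, and the `keep_ratio` parameter are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def small_loss_mask(loss, valid, keep_ratio):
    """Mark the keep_ratio fraction of valid pixels with the smallest loss.

    Assumption: small per-pixel loss is used as a proxy for a clean label.
    """
    n_keep = int(np.floor(keep_ratio * valid.sum()))
    flat_mask = np.zeros(loss.size, dtype=bool)
    if n_keep == 0:
        return flat_mask.reshape(loss.shape)
    # Invalid pixels get infinite loss so they are never selected.
    flat_loss = np.where(valid, loss, np.inf).ravel()
    keep_idx = np.argpartition(flat_loss, n_keep - 1)[:n_keep]
    flat_mask[keep_idx] = True
    return flat_mask.reshape(loss.shape)


def co_teaching_losses(pred_a, pred_b, gt, valid, keep_ratio):
    """Return (loss for branch A, loss for branch B, mask from A, mask from B)."""
    loss_a = np.abs(pred_a - gt)  # per-pixel L1 loss (assumed)
    loss_b = np.abs(pred_b - gt)

    mask_from_a = small_loss_mask(loss_a, valid, keep_ratio)
    mask_from_b = small_loss_mask(loss_b, valid, keep_ratio)

    # Exchange step: each branch is supervised on the pixels its peer selected.
    train_a = loss_a[mask_from_b].mean() if mask_from_b.any() else 0.0
    train_b = loss_b[mask_from_a].mean() if mask_from_a.any() else 0.0
    return train_a, train_b, mask_from_a, mask_from_b
```

In a real training loop, `keep_ratio` would typically be tied to the estimated noise rate and the two returned losses would be backpropagated through their respective branches; both choices are left open here.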
