Abstract

We study the problem of estimating the relative depth order of point pairs in a monocular image. Recent work mainly uses deep convolutional neural networks to learn and infer this ordinal information from multiple contextual cues of the point pairs, such as the global scene context, the local context, and the point locations. However, it remains unclear how much each cue contributes to the task. To address this, we first examine the contribution of each contextual cue to depth order estimation performance. We find that the local context surrounding the point pairs contributes the most, while the global scene context helps little. Based on these findings, we propose a simple method that uses a multi-scale densely connected network to tackle the task. Instead of learning the global structure, we focus on exploring the local structure by learning to regress from regions of multiple sizes around the point pairs. Moreover, we adopt the recent densely connected network architecture to encourage substantial feature reuse and to deepen our network, which boosts performance. Experiments show that our approach performs on par with or better than state-of-the-art methods while using only a small amount of training data.
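To make the idea of "regions of multiple sizes around the point pairs" concrete, the following is a minimal sketch of how such local regions might be extracted before being passed to a network. It is an illustration only: the function names `crop_patch` and `multi_scale_patches`, the patch sizes (16, 32, 64), and the edge-replication padding are all assumptions, since the abstract does not specify them, and the network itself is not shown.

```python
import numpy as np


def crop_patch(image, center, size):
    """Crop a size x size patch centered on (row, col).

    Hypothetical helper: borders are handled by edge replication,
    which is an assumption and not taken from the paper.
    """
    half = size // 2
    # Pad only the two spatial axes so crops near the border stay in bounds.
    pad_width = [(half, half), (half, half)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad_width, mode="edge")
    r, c = center
    # The point sits at (r + half, c + half) in padded coordinates,
    # so the patch starts at (r, c).
    return padded[r:r + size, c:c + size]


def multi_scale_patches(image, point_a, point_b, sizes=(16, 32, 64)):
    """Return {size: (patch around point_a, patch around point_b)}.

    The sizes are illustrative placeholders, not the paper's settings.
    """
    return {
        s: (crop_patch(image, point_a, s), crop_patch(image, point_b, s))
        for s in sizes
    }


# Usage on a synthetic 100 x 120 RGB image with one point pair.
image = np.arange(100 * 120 * 3).reshape(100, 120, 3)
patches = multi_scale_patches(image, (0, 0), (99, 119))
```

Each entry of `patches` holds the two local regions for one scale; a model along the lines described above would consume these regions rather than the full image.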
