N-QGNv2: Predicting the optimum quadtree representation of a depth map from a monocular camera

Daniel Braun, Olivier Morel, Cédric Demonceaux, Pascal Vasseur

doi:10.1016/j.patrec.2024.01.027

Abstract

Self-supervised monocular depth prediction is a widely researched field that aims to provide a better scene understanding. However, most existing methods prioritize prediction accuracy over computation cost, which can hinder their deployment in real-world applications. Our objective is to propose a solution that efficiently compresses the depth map while maintaining a high level of accuracy for navigation purposes. The proposed method extends the work presented in N-QGN, which uses a quadtree representation for compression. This approach has already shown promising results, but we aim to improve it further by making it more accurate, faster, and easier to train. Therefore, we introduce a new method that directly predicts the quadtree structure, resulting in a more consistent prediction, and we revise the network architecture to be lighter and to produce state-of-the-art accuracy results, depending on the data compression rate. The new implementation is also faster, making it more suitable for real-time applications. Experiments have been conducted on various scene configurations, highlighting the capability of the method to efficiently predict a reliable quadtree depth representation of the scene at low computation cost and with high accuracy.

Full Text