CMAN: Leaning Global Structure Correlation for Monocular 3D Object Detection

Yuanzhouhan Cao, Hui Zhang, Chao Ren, Congyan Lang, Yidong Li

doi:10.1109/tits.2022.3205446

Abstract

The key to 3D object detection is the proper utilization of depth data. Compared with LiDAR-based approaches, 3D object detection from a single image remains a challenging task due to the lack of structure information. Recent methods leverage monocular depth estimation to produce 2D depth maps and adopt these depth maps as an additional source of input to explore structure information. However, these methods either encode only local structure correlations or encode long-range structure correlations by iteratively passing local messages. In this work, we propose a cross-modal attention network (CMAN) for monocular 3D object detection. It is built upon the self-attention module, which learns an attention map from single-modal data. Our CMAN is able to encode structure correlations from depth data and embed the structure correlations with appearance information learned from RGB data. Thanks to the attention learning mechanism, our CMAN learns global structure correlations without iteration. To reduce the computational burden, our CMAN adopts a novel node sampler to eliminate redundant nodes during the attention map calculation. Experimental results on the KITTI3D benchmark dataset show that our proposed CMAN outperforms the state-of-the-art methods.
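
The cross-modal idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only assumes that the attention map is computed from depth features and then applied to RGB features, with a residual sum to combine the two. The function name `cross_modal_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the strided subsampling used in place of the node sampler are all hypothetical; the abstract does not say how the actual sampler selects nodes.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(rgb_feat, depth_feat, w_q, w_k, w_v, stride=1):
    """Attend over RGB features using an attention map built from depth features.

    rgb_feat, depth_feat: N x C lists (one row per spatial node).
    w_q, w_k: C x d projections applied to depth; w_v: C x C applied to RGB.
    stride: hypothetical stand-in for a node sampler (keeps every stride-th node).
    """
    # Placeholder sampler: a fixed stride, NOT the sampler proposed in the paper.
    idx = list(range(0, len(depth_feat), stride))

    q = matmul(depth_feat, w_q)                       # N x d, queries from depth
    k = matmul([depth_feat[i] for i in idx], w_k)     # M x d, keys from sampled depth nodes
    v = matmul([rgb_feat[i] for i in idx], w_v)       # M x C, values from sampled RGB nodes

    # Scaled dot-product scores between every node and every sampled node.
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]              # d x M
    attn = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]  # N x M

    # Each node aggregates appearance features from all sampled nodes in one step,
    # so no iterative message passing is needed. Residual sum is an assumption.
    mixed = matmul(attn, v)                           # N x C
    out = [[r + m for r, m in zip(r_row, m_row)] for r_row, m_row in zip(rgb_feat, mixed)]
    return out, attn
```

With `stride=2` on four nodes, the attention map has shape 4 x 2 rather than 4 x 4, which is the kind of saving a node sampler is meant to provide: the cost of the map scales with the number of retained nodes instead of the full node count.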

Journal: IEEE Transactions on Intelligent Transportation Systems
Publication Date: Dec 1, 2022
Citations: 3

Similar Papers

Camera and LiDAR-based point painted voxel region-based convolutional neural network for robust 3D object detection
Han Xie ... Wenqi Zheng
Journal of Electronic Imaging | VOL. 31
06 Oct 2022

An End-to-End Deep Learning Network for 3D Object Detection From RGB-D Data Based on Hough Voting
Ming Yan ... Zhongtong Li
IEEE Access | VOL. 8
01 Jan 2020

SMS-Net: Sparse multi-scale voxel feature aggregation network for LiDAR-based 3D object detection
Sheng Liu ... Shengyong Chen
Neurocomputing | VOL. 501
21 Jun 2022

Efficient flexible voxel-based two-stage network for 3D object detection in autonomous driving
Fanyue Sun ... Yan Song
Applied Soft Computing | VOL. 162
12 Jun 2024
