Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection

Xianpeng Liu, Nan Xue, Tianfu Wu

doi:10.1609/aaai.v36i2.20074

Abstract

Monocular 3D object detection aims to localize 3D bounding boxes from a single 2D input image. It is a highly challenging problem and remains open, especially when no extra information (e.g., depth, LiDAR and/or multiple frames) can be leveraged in training and/or inference. This paper proposes a simple yet effective formulation for monocular 3D object detection without exploiting any extra information. It presents the MonoCon method, which learns Monocular Contexts as auxiliary tasks in training to help monocular 3D object detection. The key idea is that, given the annotated 3D bounding boxes of objects in an image, a rich set of well-posed projected 2D supervision signals is available in training, such as the projected corner keypoints and their associated offset vectors with respect to the center of the 2D bounding box, and these signals should be exploited as auxiliary tasks in training. At a high level, the proposed MonoCon is motivated by the Cramér–Wold theorem in measure theory. In implementation, it uses a very simple end-to-end design to demonstrate the effectiveness of learning auxiliary monocular contexts. The design consists of three components: a Deep Neural Network (DNN) based feature backbone, a number of regression head branches for learning the essential parameters used in the 3D bounding box prediction, and a number of regression head branches for learning auxiliary contexts. After training, the auxiliary context regression branches are discarded for better inference efficiency. In experiments, the proposed MonoCon is evaluated on the KITTI benchmark (car, pedestrian and cyclist). It outperforms all prior art on the leaderboard in the car category and obtains comparable accuracy on pedestrian and cyclist. Thanks to the simple design, the proposed MonoCon method obtains the fastest inference speed among the compared methods, at 38.7 fps. Our code is released at https://git.io/MonoCon.
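The projected 2D supervision signals mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `u0`, `v0`, a camera frame with x right, y down and z forward, a box parameterized by its geometric center, its dimensions (h, w, l) and a yaw angle about the y axis, and, as a simplification, a 2D bounding box taken as the tight box around the projected corners. All function names are illustrative.

```python
import math


def box3d_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    Assumed convention (not taken from the paper): x right, y down,
    z forward; `center` is the geometric center, `dims` is (h, w, l),
    and `yaw` rotates the box about the camera's y axis.
    """
    cx, cy, cz = center
    h, w, l = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (l / 2, -l / 2):          # length along the object's x axis
        for dy in (h / 2, -h / 2):      # height along y
            for dz in (w / 2, -w / 2):  # width along the object's z axis
                # Rotate about the y axis, then translate to the box center.
                x = c * dx + s * dz + cx
                z = -s * dx + c * dz + cz
                corners.append((x, cy + dy, z))
    return corners


def project(point, fx, fy, u0, v0):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + u0, fy * y / z + v0)


def auxiliary_targets(center, dims, yaw, fx, fy, u0, v0):
    """Build projected corner keypoints, a 2D box, and per-keypoint offsets.

    Simplification: the 2D box is the tight box around the projected
    corners, and offsets are measured from that box's center.
    """
    corners = box3d_corners(center, dims, yaw)
    keypoints = [project(p, fx, fy, u0, v0) for p in corners]
    us = [u for u, _ in keypoints]
    vs = [v for _, v in keypoints]
    bbox = (min(us), min(vs), max(us), max(vs))
    bcx = (bbox[0] + bbox[2]) / 2
    bcy = (bbox[1] + bbox[3]) / 2
    offsets = [(u - bcx, v - bcy) for u, v in keypoints]
    return keypoints, bbox, offsets


# Example: a box 10 m in front of the camera, with made-up intrinsics.
keypoints, bbox, offsets = auxiliary_targets(
    center=(0.0, 0.0, 10.0), dims=(2.0, 2.0, 4.0), yaw=0.0,
    fx=100.0, fy=100.0, u0=0.0, v0=0.0,
)
```

In a training pipeline, targets like `keypoints` and `offsets` would serve as regression labels for the auxiliary branches only. Nothing here is needed at inference time, which is consistent with the abstract's statement that those branches are discarded after training.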

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 28, 2022
Citations: 48

Similar Papers

Monocular 3D Multi-Object Tracking with an EKF Approach for Long-Term Stable Tracks
Andreas Reich ... Hans-Joachim Wuensche
01 Nov 2021

Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud
Xinshuo Weng ... Kris Kitani
01 Oct 2019

Semi-supervised Monocular 3D Object Detection by Multi-view Consistency
Qing Lian ... Yingcong Chen
01 Jan 2021

Monocular 3D object detection using dual quadric for autonomous driving
Peixuan Li ... Huaici Zhao
Neurocomputing | VOL. 441
16 Feb 2021
