Monocular 3D Detection With Geometric Constraint Embedding and Semi-Supervised Training

Peixuan Li, Huaici Zhao

doi:10.1109/lra.2021.3061343

Abstract

In this work, we propose a novel one-stage, keypoint-based framework for monocular 3D object detection using only RGB images, called KM3D-Net. 2D detection only requires a deep neural network to predict 2D properties of objects, as it is a semantics-aware task. For image-based 3D detection, we argue that a combination of a deep neural network and geometric constraints is needed to jointly estimate appearance-related and spatial-related information. Here, we design a fully convolutional model to predict object keypoints, dimensions, and orientation, and combine these with perspective geometry constraints to compute position attributes. Further, we reformulate the geometric constraints in a differentiable form and embed them in the network, which reduces running time while keeping the model outputs consistent in an end-to-end fashion. Benefiting from this simple structure, we propose an effective semi-supervised training strategy for settings where labeled training data are scarce. In this strategy, we enforce a consensus prediction between two weight-sharing KM3D-Nets for the same unlabeled image under different input augmentation conditions and network regularization. In particular, we unify the coordinate-dependent augmentations as an affine transformation so that object positions can still be recovered differentiably, and we propose a keypoint-dropout module for network regularization. Our model requires only RGB images, without synthetic data, instance segmentation, CAD models, or a depth generator. Extensive experiments on the popular KITTI 3D detection dataset indicate that KM3D-Net surpasses state-of-the-art methods by a large margin in both efficiency and accuracy. To the best of our knowledge, this is also the first application of semi-supervised learning to monocular 3D object detection. We surpass most previous fully supervised methods with only 13% labeled data on KITTI.
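To make the geometric step concrete, the following is a minimal sketch of how an object's position could be recovered from predicted keypoints, dimensions, and orientation through perspective constraints. It is not the authors' implementation. It assumes the keypoints are the eight projected box corners, a KITTI-style camera frame (y pointing down, yaw about the y axis, box origin at the bottom center), and a hypothetical corner ordering. Each keypoint contributes two equations that are linear in the translation, so the position follows from a least-squares solve, an operation that is differentiable in principle.

```python
import numpy as np

def box_corners(dims, yaw):
    """Rotated 3D box corners relative to the box origin (assumed bottom center)."""
    h, w, l = dims
    x = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    y = np.array([0, 0, 0, 0, -h, -h, -h, -h], dtype=float)
    z = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])  # rotation about the y axis
    return (R @ np.vstack([x, y, z])).T  # shape (8, 3)

def project(points, K):
    """Pinhole projection of camera-frame points to pixel coordinates."""
    uvw = (K @ points.T).T
    return uvw[:, :2] / uvw[:, 2:3]

def solve_position(keypoints, dims, yaw, K):
    """Least-squares translation T from 2D corner keypoints.

    For a rotated corner p and normalized image point (un, vn):
        T_x - un * T_z = un * p_z - p_x
        T_y - vn * T_z = vn * p_z - p_y
    """
    p = box_corners(dims, yaw)
    ones = np.ones((len(keypoints), 1))
    n = (np.linalg.inv(K) @ np.hstack([keypoints, ones]).T).T
    un, vn = n[:, 0], n[:, 1]
    A = np.zeros((2 * len(p), 3))
    b = np.zeros(2 * len(p))
    A[0::2, 0] = 1.0
    A[0::2, 2] = -un
    b[0::2] = un * p[:, 2] - p[:, 0]
    A[1::2, 1] = 1.0
    A[1::2, 2] = -vn
    b[1::2] = vn * p[:, 2] - p[:, 1]
    T, *_ = np.linalg.lstsq(A, b, rcond=None)
    return T

# Synthetic check: project a known box, then recover its position.
K = np.array([[720.0, 0.0, 620.0], [0.0, 720.0, 190.0], [0.0, 0.0, 1.0]])
dims, yaw = (1.5, 1.6, 3.9), 0.3          # illustrative values only
T_true = np.array([1.5, 1.6, 20.0])
keypoints = project(box_corners(dims, yaw) + T_true, K)
T_est = solve_position(keypoints, dims, yaw, K)
```

With noise-free keypoints the recovered `T_est` matches `T_true` up to numerical precision. With noisy network predictions, the overdetermined system (16 equations, 3 unknowns) averages out part of the keypoint error.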


Journal: IEEE Robotics and Automation Letters
Publication Date: Jul 1, 2021
Citations: 48

Similar Papers

MonoDCN: Monocular 3D object detection based on dynamic convolution.
Shenming Qu ... Xinyu Yang
PLOS ONE | VOL. 17
04 Oct 2022

Kinematic 3D Object Detection in Monocular Video
Garrick Brazil ... Xiaoming Liu
01 Jan 2020

Depth-enhancement network for monocular 3D object detection
Guohua Liu ... Changrui Guo
Measurement Science and Technology | VOL. 35
05 Jun 2024

Pseudo-Mono for Monocular 3D Object Detection in Autonomous Driving
Chongben Tao ... Jiecheng Cao
IEEE Transactions on Circuits and Systems for Video Technology | VOL. 33
01 Aug 2023
