Star-Convolution for Image-Based 3D Object Detection

Yuxuan Liu,Ming Liu,Zhenhua Xu

doi:10.1109/icra46639.2022.9811612

Abstract

3D object detection with only image inputs is an interesting and important problem in computer vision and autonomous driving. Nowadays, most existing monocular 3D object detection algorithms rely solely on the approximation power of convolutional neural networks to learn a mapping from pixels to 3D predictions without knowing the projection matrix of the camera. To endow the networks with camera projection knowledge, we propose the Star-Convolution module for application to image-based 3D detection. The introduced module increases the receptive field of the detector and embeds the camera's projection geometry inside the network while keeping the network end-to-end trainable. We test the module with different baselines in both monocular and stereo 3D object detection, and we achieve significant improvements on both tasks. The code will be published at https://github.com/Owen-Liuyuxuan/visualDet3D.

Full Text