Abstract

3D object detection from image inputs alone is an important problem in computer vision and autonomous driving. Most existing monocular 3D object detection algorithms rely solely on the approximation power of convolutional neural networks to learn a mapping from pixels to 3D predictions, without access to the camera's projection matrix. To endow the networks with camera projection knowledge, we propose the Star-Convolution module for image-based 3D detection. The module enlarges the detector's receptive field and embeds the camera's projection geometry in the network while keeping it end-to-end trainable. We test the module with different baselines on both monocular and stereo 3D object detection and achieve significant improvements on both tasks. The code will be published at https://github.com/Owen-Liuyuxuan/visualDet3D.

