An End-to-End Deep Learning Network for 3D Object Detection From RGB-D Data Based on Hough Voting

Ming Yan, Xinyan Yu, Cong Jin, Zhongtong Li

doi:10.1109/access.2020.3012695

Open Access

https://doi.org/10.1109/access.2020.3012695

Abstract

Existing outdoor three-dimensional (3D) object detection algorithms mainly rely on a single type of sensor, for example a monocular camera alone or a radar point cloud alone. However, camera sensors are affected by lighting and lose depth information, and when a short-range radar point cloud sensor scans a distant or occluded object, the collected data are very sparse, which degrades the detection algorithm. To address these challenges, we design a deep learning network that combines the texture information of two-dimensional (2D) data with the geometric information of 3D data for object detection. To overcome the limitations of a single sensor, we use a reverse mapping layer and an aggregation layer to combine the texture information of RGB data with the geometric information of point cloud data, and we design a maximum pooling layer to handle the input from multi-view cameras. In addition, to address the shortcomings of 3D object detection algorithms based on the region proposal network (RPN) method, we use a Hough voting algorithm implemented by a deep neural network to propose objects. Experimental results show that, for easy car object detection, the average precision (AP) of our algorithm is 1.06% lower than that of PointRCNN, but our algorithm requires 37.7% less computation time than PointRCNN under the same hardware environment. Moreover, for hard car object detection, our algorithm improves the AP by 1.14% compared to PointRCNN.
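The multi-view pooling step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `fuse_multiview`, the array shapes, and the choice to give zero texture features to points seen by no camera are all assumptions made for illustration. It only shows the general idea of taking an element-wise maximum over per-view texture features and concatenating the result with per-point geometric features.

```python
import numpy as np

def fuse_multiview(point_feats, view_feats, view_mask):
    """Hypothetical fusion of geometric and multi-view texture features.

    point_feats: (N, Cg) geometric feature per point.
    view_feats:  (V, N, Ct) texture feature sampled for each point in each view.
    view_mask:   (V, N) bool, True where the point projects into that view.
    Returns an (N, Cg + Ct) array.
    """
    # Views that do not see a point contribute -inf, so they never win the max.
    masked = np.where(view_mask[..., None], view_feats, -np.inf)
    pooled = masked.max(axis=0)  # element-wise max over the V views
    # Assumption: points seen by no camera get all-zero texture features.
    seen = view_mask.any(axis=0)
    pooled = np.where(seen[:, None], pooled, 0.0)
    # Aggregate texture and geometry by channel-wise concatenation.
    return np.concatenate([point_feats, pooled], axis=1)

point_feats = np.ones((3, 2))
view_feats = np.array([
    [[1.0, 5.0], [2.0, 2.0], [9.0, 9.0]],
    [[4.0, 0.0], [3.0, 1.0], [8.0, 8.0]],
])
view_mask = np.array([
    [True, True, False],
    [True, False, False],
])
fused = fuse_multiview(point_feats, view_feats, view_mask)
```

Because the maximum is symmetric in its inputs, the pooled feature does not depend on the order of the cameras or on how many of them see a given point, which is one plausible reason to use max pooling for a variable number of views.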

Highlights

In recent years, robotics and autonomous driving technologies have developed rapidly.
Compared with the region proposal network (RPN) method commonly used in object detection algorithms, the Hough voting method is more suitable for 3D object detection.
Visualized results: to verify the performance of the object detection algorithm proposed in this paper, scenario No. 1925 in the KITTI object detection dataset is selected for testing.

Summary

INTRODUCTION

Robotics and autonomous driving technologies have developed rapidly. Designing algorithms that can simultaneously use the RGB information of the image and the geometric information of the point cloud is key to improving the stability and accuracy of detection. We propose a new end-to-end deep learning network for 3D object detection and recognition. This method takes RGB and point cloud data as input at the same time and combines the texture information and geometric information in the two types of data. The novel network architecture combines the texture features of 2D image data with the geometric features of 3D data to improve detection performance in complex environments. Compared with the region proposal network (RPN) method commonly used in object detection algorithms, the Hough voting method is more suitable for 3D object detection.
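The general idea behind Hough voting for object proposals can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the network described in the paper: in a learned voting network the per-point offsets would be predicted by a neural network, whereas here they are passed in directly, and the greedy radius-based grouping in `cluster_votes` is a hypothetical stand-in for whatever clustering the authors use. The names `cast_votes`, `cluster_votes`, and `radius` are introduced only for this example.

```python
import numpy as np

def cast_votes(seed_xyz, predicted_offsets):
    """Each seed point votes for an object center: vote = seed + offset."""
    return seed_xyz + predicted_offsets

def cluster_votes(votes, radius):
    """Greedily group votes that fall within `radius` of a pivot vote.

    Returns the mean position of each group (a candidate object center)
    and the number of votes supporting it.
    """
    remaining = np.ones(len(votes), dtype=bool)
    centers, counts = [], []
    while remaining.any():
        pivot = np.flatnonzero(remaining)[0]  # first unassigned vote
        dist = np.linalg.norm(votes - votes[pivot], axis=1)
        members = remaining & (dist <= radius)
        centers.append(votes[members].mean(axis=0))
        counts.append(int(members.sum()))
        remaining &= ~members  # mark the group as assigned
    return np.array(centers), np.array(counts)

# Four surface points belonging to two objects; offsets are assumed given.
seeds = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0],
                  [10.0, 10.0, 0.0], [12.0, 10.0, 0.0]])
offsets = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
votes = cast_votes(seeds, offsets)
centers, counts = cluster_votes(votes, radius=0.5)
```

In this toy input the surface points lie away from the object centers, but their votes coincide at the centers, so each cluster of votes yields one proposal. Accumulating evidence at centers rather than on sparse surface points is the property that motivates voting for point cloud data.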

RELATED WORK

HOUGH VOTING NETWORK

OBJECT CLASSIFICATION AND REGRESSION

LOSS FUNCTION

CONCLUSION