Abstract

For autonomous vehicles, the ability to detect and localize surrounding vehicles is critical. It is fundamental for further processing steps such as collision avoidance or path planning. This paper introduces a convolutional neural network-based vehicle detection and localization method using point cloud data acquired by a LIDAR sensor. Acquired point clouds are transformed into bird's eye view elevation images, where each pixel represents a grid cell of the horizontal x-y plane. We intentionally encode each pixel using three channels, namely the maximal, median and minimal height value of all points within the respective grid cell. A major advantage of this three-channel representation is that it allows us to utilize common RGB image-based detection networks without modification. The bird's eye view elevation images are processed by a two-stage detector. Due to the nature of the bird's eye view, each pixel of the image represents ground coordinates, meaning that the bounding boxes of detected vehicles correspond directly to the horizontal positions of the vehicles. Therefore, in contrast to RGB-based detectors, we not only detect the vehicles but also simultaneously localize them in ground coordinates. To evaluate the accuracy of our method and its usefulness for further high-level applications like path planning, we evaluate the detection results based on the localization error in ground coordinates. Our proposed method achieves an average precision of 87.9% for an intersection over union (IoU) value of 0.5. In addition, 75% of the detected cars are localized with an absolute positioning error below 0.2 m.
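To make the three-channel encoding described above concrete, the following is a minimal sketch of how a point cloud could be rasterized into such a bird's eye view elevation image. The function name `point_cloud_to_bev`, the grid extents, and the cell size are illustrative assumptions, not values taken from the paper.

```python
# Hypothetical sketch of the three-channel bird's eye view encoding.
# Grid extents and cell size are illustrative assumptions, not the
# paper's actual parameters.
import numpy as np

def point_cloud_to_bev(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0),
                       cell_size=0.1):
    """Encode a LIDAR point cloud (N x 3 array of x, y, z) as a bird's
    eye view image whose three channels hold the maximal, median, and
    minimal height of the points falling into each ground-plane cell."""
    width = int((x_range[1] - x_range[0]) / cell_size)
    height = int((y_range[1] - y_range[0]) / cell_size)
    bev = np.zeros((height, width, 3), dtype=np.float32)

    # Keep only points inside the chosen ground-plane extent.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]

    # Map each point to its grid cell.
    col = ((pts[:, 0] - x_range[0]) / cell_size).astype(int)
    row = ((pts[:, 1] - y_range[0]) / cell_size).astype(int)

    # Group points per cell, then write max / median / min height.
    flat = row * width + col
    order = np.argsort(flat)
    flat, z = flat[order], pts[order, 2]
    cells, starts = np.unique(flat, return_index=True)
    for cell, zs in zip(cells, np.split(z, starts[1:])):
        r, c = divmod(int(cell), width)
        bev[r, c] = (zs.max(), np.median(zs), zs.min())
    return bev
```

Because the result is a standard height x width x 3 array, it can be fed to an off-the-shelf RGB detection network without architectural changes, which is the motivation the abstract gives for this encoding.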
