Abstract

This paper proposes a three-stage deep neural network (DNN) framework that fuses camera and lidar data for 3D object recognition. First, to leverage the high resolution of the camera and the 3D spatial information of the lidar, a region proposal network (RPN) is trained to generate proposals from RGB image feature maps and bird's-eye-view (BV) feature maps; these proposals are then lifted into 3D proposals. Second, a segmentation network extracts object points directly from the points inside these 3D proposals. Finally, 3D object bounding boxes are estimated from the object points by an estimation network, after a translation predicted by a lightweight TNet, a special supervised spatial transformer network (STN). Experimental results show that the proposed framework achieves results comparable to other leading methods on the KITTI 3D object detection benchmark.
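The data flow through the three stages can be sketched as follows. This is a minimal illustrative skeleton, not the authors' implementation: every function body, shape, and threshold below is a placeholder assumption, chosen only to show how proposals, segmented points, and boxes pass between stages.

```python
import numpy as np

def stage1_rpn(rgb_features, bv_features, num_proposals=8):
    """Stage 1 (sketch): fuse RGB and bird's-eye-view (BV) feature maps,
    emit proposals, and lift them to 3D boxes (x, y, z, l, w, h, yaw).
    Here the network is replaced by random placeholder boxes."""
    rng = np.random.default_rng(0)
    return rng.uniform(-1.0, 1.0, size=(num_proposals, 7))

def stage2_segment(points, proposals_3d):
    """Stage 2 (sketch): for each 3D proposal, keep only the lidar points
    that the segmentation network scores as foreground.  The learned
    mask is replaced by a dummy height test."""
    object_points = []
    for _ in proposals_3d:
        mask = points[:, 2] > 0.0          # placeholder foreground mask
        object_points.append(points[mask])
    return object_points

def stage3_estimate(object_points):
    """Stage 3 (sketch): a TNet-style translation centers each point set,
    then a box is estimated in the object-centered frame."""
    boxes = []
    for pts in object_points:
        if len(pts) == 0:
            continue
        center = pts[:, :3].mean(axis=0)   # stands in for the TNet translation
        local = pts[:, :3] - center        # points in the object frame
        size = local.max(axis=0) - local.min(axis=0)
        boxes.append(np.concatenate([center, size, [0.0]]))  # yaw left at 0
    return np.asarray(boxes)

# Toy end-to-end run on random lidar points (x, y, z, intensity).
points = np.random.default_rng(1).normal(size=(500, 4))
proposals = stage1_rpn(rgb_features=None, bv_features=None)
obj_pts = stage2_segment(points, proposals)
boxes = stage3_estimate(obj_pts)
print(boxes.shape)  # one 7-parameter box per surviving proposal
```

The point of the sketch is only the hand-off between stages: 3D proposals from stage 1 crop the raw point cloud, stage 2 filters those crops to foreground points, and stage 3 centers and boxes each filtered set.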
