Abstract

Point cloud based 3D object detection plays a crucial role in real-world applications such as autonomous driving. In this paper, we propose the Multi-view Semantic Learning Network (MVSLN) for 3D object detection, an approach that considers feature discrimination for the LiDAR point cloud. Owing to the discrete and disordered nature of point clouds, most existing methods ignore low-level information and focus more on the spatial details of the point cloud. To capture the discriminative features of objects, our MVSLN takes advantage of both spatial and low-level details to further exploit semantic information. Specifically, the Multiple Views Generator (MVG) module in our approach observes the scene from four views by projecting the 3D point cloud onto planes at specific angles, which preserves many more low-level features, e.g., texture and edges. To correct the deviation introduced by the different projection angles, the Spatial Recalibration Fusion (SRF) operation in our approach adjusts the locations of the features of these four views, enabling interaction between the different projections. The recalibrated features from the SRF are then sent to the developed 3D Region Proposal Network (RPN) to detect objects. Experimental results on the challenging KITTI benchmark verify that our approach achieves promising performance and outperforms state-of-the-art methods. Furthermore, the discriminative feature extractor, obtained by exploiting conspicuous semantic information, leads to encouraging results on the hard difficulty level of both the BEV and 3D object detection tasks, without any help from camera images.
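
To make the multi-view projection idea concrete, the sketch below shows one plausible way to generate a single view: rotate the point cloud about the vertical axis by a given angle and rasterise the result into a 2D height map, repeating for four angles. This is a minimal illustration, not the authors' MVG implementation; the function and parameter names (project_view, grid, extent) and the choice of a max-height raster are assumptions made for this example.

```python
import numpy as np

def project_view(points, angle_deg, grid=(256, 256), extent=40.0):
    """Rotate a LiDAR cloud about the z axis by angle_deg, then
    project it onto a 2D grid as a max-height pseudo-image.
    Illustrative sketch only, not the paper's exact MVG module."""
    theta = np.deg2rad(angle_deg)
    # Rotation matrix about the vertical (z) axis.
    rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                    [np.sin(theta),  np.cos(theta), 0.0],
                    [0.0,            0.0,           1.0]])
    p = points[:, :3] @ rot.T

    # Keep points inside a square region of side 2 * extent metres.
    mask = (np.abs(p[:, 0]) < extent) & (np.abs(p[:, 1]) < extent)
    p = p[mask]

    # Discretise x/y onto the grid and record the max height per cell.
    h, w = grid
    ix = ((p[:, 0] + extent) / (2 * extent) * (h - 1)).astype(int)
    iy = ((p[:, 1] + extent) / (2 * extent) * (w - 1)).astype(int)
    view = np.full(grid, -np.inf, dtype=np.float32)
    np.maximum.at(view, (ix, iy), p[:, 2])
    view[np.isinf(view)] = 0.0  # empty cells get a zero background
    return view

# Four views at different projection angles, echoing the MVG idea;
# the angles here are placeholders, not those used in the paper.
points = np.random.randn(10000, 3) * 10.0  # stand-in for a real scan
views = [project_view(points, a) for a in (0, 45, 90, 135)]
```

Each resulting view is an image-like tensor, so conventional 2D convolutions can extract the texture and edge cues the abstract refers to before the SRF stage aligns and fuses the four feature maps.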
