Abstract

As an important type of 3D representation, the point cloud is widely used in applications such as autonomous driving, AR/VR, and intelligent robots, which require real-time interaction with humans. However, the sparsity of 3D point cloud data leads to severe computational inefficiency when it is processed by 2D data processors, posing a major challenge for hardware acceleration. In this paper, we address this inefficiency through algorithm-hardware co-optimization. First, a lightweight network, named LPN, is proposed for point cloud classification; it is 30× smaller than PointNet while achieving comparable accuracy. Second, a reconfigurable computing core, named RCC, together with an adaptive dataflow, is developed to support the different layers of the LPN. Specifically, to accelerate memory-intensive layers, a partially-parallel computing scheme is introduced to minimize on-chip memory requirements and DRAM accesses. Finally, based on these innovations, a low-latency accelerator is proposed to realize real-time point cloud computation, and it is implemented on the Xilinx Kintex UltraScale KCU150 FPGA board. Experimental results show that it achieves 1.5× higher throughput than state-of-the-art works and a 35× speedup over an Intel Xeon Gold 6148 CPU, demonstrating the superiority of the proposed method. The code of LPN is available at https://github.com/snowsil/LPN-model-for-3D-classification.git.
