Abstract

Light Detection and Ranging (LiDAR) is becoming a critical requirement for future computer vision applications, such as AR/VR (iPhone LiDAR) and ADAS (automotive LiDAR). A depth point-cloud input has different characteristics from a conventional RGB image input, so a CNN depth-inference implementation differs from a standard super-resolution CNN (SR-CNN). In this brief, we present a heterogeneous AI-accelerator SoC dedicated to depth-image-completion computation. Three key innovations improve the SoC's performance. First, to accommodate the unique data structure of a depth input, a fully-filled dataflow management engine pre-processes the RGB+Depth input, significantly improving processing-element utilization (PEU). Second, to make the CNN accelerator's instruction configuration more efficient, a hardware tiling co-processor executes the accelerator's tiling strategy and assigns each sub-job to the PE array directly, reducing the time spent on task assignment. Third, because the neural network's post-processing requires a large number of vector operations, a RISC-V core is incorporated to execute the vector computations more efficiently. The SoC is implemented in a 40 nm CMOS process, achieving 2 TOPS/W energy efficiency with 34 fps throughput at VGA-resolution output for real-time LiDAR systems.
