The hardware architecture of a deep-neural-network-based feature point extraction method is proposed for the simultaneous localization and mapping (SLAM) in robotic applications, which is named the Feature Point based SLAM Network (FPSNET). Some key techniques are deployed to improve the hardware and power efficiency. The data path is devised to reduce overall off-chip memory accesses. The intermediate data and partial sums resulting in the convolution process are stored in available on-chip memories, and optimized hardware is employed to compute the one-point activation function. Meanwhile, address generation units are used to avoid data overlapping in memories. The proposed FPSNET has been designed in 65 nm CMOS technology with a core area of 8.3 mm2. This work reduces the memory overhead by 50% compared to traditional data storage for activation and overall by 35% for on-chip memories. The synthesis and simulation results show that it achieved a 2.0× higher performance compared with the previous design while achieving a power efficiency of 1.0 TOPS/W, which is 2.4× better than previous work. Compared to other ASIC designs with close peak throughput and power efficiency performance, the presented FPSNET has the smallest chip area (at least 42.4% reduction).
Read full abstract