Abstract

Deploying neural network (NN) models on Internet-of-Things (IoT) devices is important for enabling artificial intelligence (AI) on the edge and realizing the AI-of-Things (AIoT). However, the high energy consumption and bandwidth requirements of NN models restrict AI applications on battery-limited equipment. Compute-In-Memory (CIM), featuring high energy efficiency, provides new opportunities for deploying NNs on IoT devices. However, the design of full CIM-based systems is still at an early stage, lacking system-level demonstration and vertical optimization for running end-to-end AI applications. In this paper, we demonstrate a low-power heterogeneous microprocessor System-on-Chip (SoC) with an all-digital SRAM CIM accelerator and rich data acquisition interfaces for end-to-end AIoT NN inference. A dedicated reconfigurable dataflow controller for CIM computation greatly lowers the bandwidth requirement on the system bus and improves execution efficiency. The all-digital SRAM CIM array embeds NAND-based bit-serial multiplication within the readout sense amplifier, balancing storage density and system-level throughput. Our chip achieves a throughput of 12.8 GOPS with an energy efficiency of 10 TOPS/W. Benchmarked on the four MLPerf Tiny tasks, experimental results show a 1.8x to 2.9x inference speedup over a baseline CIM processor.
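To illustrate the bit-serial multiplication scheme mentioned above, the following is a minimal software sketch of how a NAND-based bit-serial multiply can be expressed functionally. It is not the chip's implementation: the 8-bit input and weight precisions and the single-operand formulation are assumptions made purely for illustration; in the actual array the NAND is evaluated in the readout sense amplifier across many rows in parallel.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative sketch only; bit widths and structure are assumptions,
 * not taken from the chip described in this paper. */
#define IN_BITS 8   /* input activation precision (assumed) */
#define W_BITS  8   /* weight precision (assumed) */

/* The input is streamed one bit per cycle. Each cycle, the 1-bit partial
 * product of the streamed input bit and every stored weight bit is formed
 * as NOT(NAND(in_bit, w_bit)) == AND(in_bit, w_bit), then shift-accumulated
 * according to the two bit positions. */
static uint32_t bit_serial_mul(uint8_t in, uint8_t w) {
    uint32_t acc = 0;
    for (int i = 0; i < IN_BITS; i++) {            /* one input bit per cycle */
        int in_bit = (in >> i) & 1;
        for (int j = 0; j < W_BITS; j++) {         /* stored weight bits      */
            int w_bit = (w >> j) & 1;
            int nand  = !(in_bit & w_bit);         /* NAND at readout         */
            acc += (uint32_t)(!nand) << (i + j);   /* invert, shift, add      */
        }
    }
    return acc;
}

int main(void) {
    printf("%u\n", bit_serial_mul(13, 11));        /* prints 143 */
    return 0;
}
```

Serializing one operand bit by bit keeps the per-cycle in-memory logic down to single-bit NAND operations, which is why this style of multiplication can be folded into the sense-amplifier readout path without a dense multiplier per column.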
