Resource-constrained MCU-based platforms lack the resources to use high-performance accelerators such as GPUs or servers for ML applications. We define a Micro-Accelerator (MA) as a unit that accelerates ML operations as a peripheral connected to the on-chip bus of the MCU core. ML applications that target general-purpose accelerators benefit from well-equipped SDK environments, which make the design and verification flow straightforward. In contrast, an MA must be connected to the MCU core and the on-chip bus interface within the chip, so evaluating the interaction between the core and the MA is challenging: the MA has to be connected to both while the target software executes. Fabricating physical MA hardware is very costly, and licensing issues with commercial cores compound the difficulty. We propose an MA-in-the-loop (MAIL) framework that integrates a custom-designed MA into an emulation platform. The platform loads software onto the MCU for virtual execution, which allows hardware-software interactions to be observed during ML execution. The framework combines software that emulates the environment in which general ML applications run on the MCU with RTL simulation that profiles acceleration on the MA. The MA can be reconfigured at runtime to explore the design space, making it possible to evaluate the ML software execution flow and how performance changes across MA architectures. To benchmark the framework, we compared the profiles of TinyML applications against pure software execution. Experimental results show that the MA-accelerated framework performs comparably to actual MCUs, validating the efficacy of the proposed approach.