Application-Specific Instruction set Processor (ASIP) has become an important design choice for embedded systems. It can achieve both high flexibility offered by the base processor core and high performance and energy efficiency offered by the dedicated hardware extensions. Although a lot of efforts have been devoted to computation acceleration, for example, automatic custom instruction identification and synthesis, limited on-chip data storage elements including the register file and data cache have become a potential performance bottleneck. For custom instructions that have more inputs and/or outputs than the generic register file I/O ports, custom registers are added in ASIPs to satisfy the need of additional inputs and outputs, and traditionally they are used only by custom instructions. In this article, we propose a hardware/software cooperative approach with a linear scan register allocation algorithm, which allows base instructions to utilize the existing custom registers in ASIPs for eliminating register spills of the program. The data traffic between the base processor and off-chip memory can be replaced with energy-efficient on-chip communications between the processor core and custom hardware extensions. Our experimental results demonstrate that a significant performance gain can be achieved, orthogonal to improvements by other techniques in ASIP design.
Read full abstract