Abstract

This paper proposes FuMicro, a fused microarchitecture that integrates in-order superscalar and Very Long Instruction Word (VLIW) execution in a single core. A processor with the FuMicro microarchitecture can alternate between in-order superscalar mode and VLIW mode, using the same pipeline and the same Instruction Set Architecture (ISA). A small modification is made to the compiler to expand the register file in VLIW mode. Mode switching is decided by software and requires no extra hardware. VLIW code can be provided as library functions, so users are exposed only to superscalar mode, which gives them a convenient development environment. FuMicro could serve as a universal microarchitecture because it can be applied to different ISAs. In this paper, we focus on the implementation of FuMicro with the ARM ISA. The architecture is evaluated on gem5, a cycle-accurate microarchitecture simulation platform. Compared with pure in-order superscalar mode, FuMicro improves performance by 10% on average, with a best-case improvement of 47.3%. The results show that the FuMicro microarchitecture can improve Instruction Level Parallelism (ILP) significantly, making it a promising way to expand the digital signal processing capability of a General Purpose Processor.

Highlights

  • With the evolution of wireless communication protocols, digital signal processing is becoming more and more demanding in embedded system applications

  • We first compile the C code into assembly language, then pick the code sections that are most suitable for execution in Very Long Instruction Word (VLIW) mode and rewrite them in VLIW form

  • It is difficult to insert useful instructions into the branch delay slots, and this may cause several cycles of performance loss. Such programs are inherently unsuitable for execution in VLIW mode


Summary

Introduction

With the evolution of wireless communication protocols, digital signal processing is becoming more and more demanding in embedded system applications. As digital signal processors (DSPs) become increasingly indispensable, many embedded systems combine General Purpose Processor (GPP) cores with DSP cores. In recent years, many SoCs have adopted an ARM+DSP architecture [1,2,3,4]. Architectures that incorporate both a GPP and a DSP share common drawbacks. The GPP and the DSP require different instruction sets and independent development environments, which adds a heavy workload to software design and makes such architectures time-consuming, labor-intensive, and inconvenient for users. Communication between the GPP and the DSP adds further overhead [6].
