Abstract

Instructions for concurrent processing of smaller data units than whole CPU words are useful in areas like multimedia processing and cryptography. Since the processors used in FPGA-based embedded systems lack support for such applications, this paper proposes mapping sequences of subword operations to a set of hardware components and generating the corresponding FPGA partial configurations at run-time. The technique is aimed at adaptive embedded systems that employ run-time reconfiguration to achieve high flexibility and performance. New partial configurations for circuits implementing sets of subword operations are created by merging together the relocated partial configurations of the hardware components (from a predefined library), and the configurations of the switch matrices used for the connections between the components. The paper presents and discusses results obtained for a 300MHz PowerPC CPU in a Virtex-II Pro platform FPGA. For the set of benchmarks analyzed, the complete configuration creation process takes between 1s and 24s. The run-time generated hardware versions achieve speed-ups between 11 and 73 over the software versions.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call