Abstract

In this paper, we propose a data partitioning technique for a memory subsystem that consists of a multi-ported scratchpad memory (SPM) unit and a single-ported data cache in a coarse-grained reconfigurable array (CGRA) architecture. To achieve high performance, the embedded reconfigurable processor executes programs by switching between the Non-VLIW and VLIW modes depending on the type of code region. The VLIW mode targets code regions with high instruction-level parallelism (ILP), which require high memory bandwidth, and the Non-VLIW mode targets regions with low ILP, which require low memory latency. Our technique for partitioning data between the SPM and the data cache is based on data interference graph reduction and profiling information. Given an SPM size, it finds the optimal data partitions by taking the VLIW instruction schedule into consideration. We evaluate our data partitioning technique for the CGRA architecture with three representative multimedia applications.
