Abstract

The multi-level computing architecture (MLCA) is a novel system-on-chip architecture for embedded systems, designed to exploit task-level and instruction-level parallelism in multimedia applications. The MLCA provides a unique two-level programming model that simplifies the development of embedded applications. To cope with increasing intra-system communication delays, we introduce a distributed-memory version of the MLCA in which separate storage is used for global and local application data. Global data is stored in multiple on-chip scratch-pad memories (SPMs) with non-uniform memory access (NUMA) latencies, while local data is stored in memories private to each processing unit (PU). In such designs, a key factor affecting application performance is the locality of access to global data. We introduce programming constructs and run-time support to dynamically manage the data stored in the SPMs and to influence run-time task scheduling. Collectively, our techniques improve performance by 6%-40% compared to simple static memory management and scheduling approaches.

