Memory-Aware Loop Mapping on Coarse-Grained Reconfigurable Architectures

Shouyi Yin, Shaojun Wei, Xianqing Yao, Leibo Liu, Dajiang Liu

doi:10.1109/tvlsi.2015.2474129

Abstract

Coarse-grained reconfigurable architectures (CGRAs) are a promising class of architectures that combine high performance with high power efficiency. The compute-intensive parts of an application (e.g., loops) are often mapped onto the CGRA for acceleration. Because memory access adds overhead and the communication bandwidth between the processing element (PE) array and local memory is limited, previous work on the routing problem has mainly been confined to the internal resources of the PE array (e.g., PEs and registers). Routing with PEs or registers inevitably consumes computational resources and increases the initiation interval. To solve this problem, this paper makes two contributions: 1) a precise formulation of the CGRA mapping problem that uses shared local data memory as a routing resource and 2) an effective approach for mapping loops to CGRAs. Experimental results on loops from SPEC2006, Livermore, and MiBench show that our approach (called MEMMap) can improve the performance of the kernels on CGRA by up to $1.62\times$, $1.58\times$, $1.28\times$, and $1.23\times$ compared with edge-centric modulo scheduling, EPIMap, REGIMap, and force-directed map, respectively, with an acceptable increase in compilation time.
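The link between routing and the initiation interval can be illustrated with the standard resource-constrained lower bound from modulo scheduling, where the interval is at least the number of occupied PE slots divided by the number of PEs, rounded up. The sketch below is not the paper's formulation: the function names, the kernel sizes, and the cost model (one PE slot per routing hop, versus one store plus one load per value routed through memory) are assumptions chosen for illustration, and memory bandwidth limits are ignored.

```python
import math


def res_mii(slots_needed: int, num_pes: int) -> int:
    """Resource-constrained lower bound on the initiation interval."""
    if num_pes <= 0:
        raise ValueError("num_pes must be positive")
    return max(1, math.ceil(slots_needed / num_pes))


def slots_with_pe_routing(compute_ops: int, route_hops: list[int]) -> int:
    """Assumed cost model: every routing hop occupies one PE slot."""
    return compute_ops + sum(route_hops)


def slots_with_memory_routing(compute_ops: int, route_hops: list[int]) -> int:
    """Assumed cost model: each routed value costs one store and one load."""
    return compute_ops + 2 * len(route_hops)


# Hypothetical kernel: 12 compute operations on a 2x2 PE array,
# with two values that must be carried across 4 and 5 hops.
compute_ops = 12
route_hops = [4, 5]
num_pes = 4

ii_pe = res_mii(slots_with_pe_routing(compute_ops, route_hops), num_pes)
ii_mem = res_mii(slots_with_memory_routing(compute_ops, route_hops), num_pes)
print(f"bound with PE routing: {ii_pe}, with memory routing: {ii_mem}")
```

Under these assumed numbers, long routing paths dominate the slot count when they are carried by PEs, while the memory-routed cost does not grow with path length. Whether this holds for a real kernel depends on the memory bandwidth and access overhead that the abstract mentions.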


Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Publication Date: May 1, 2016
Citations: 63

Similar Papers

An efficient compilation of coarse-grained reconfigurable architectures utilizing pre-optimized sub-graph mappings
Ayaka Ohwada ... Takuya Kojima
01 Mar 2022

Towards Higher Performance and Robust Compilation for CGRA Modulo Scheduling
Zhongyuan Zhao ... Wenzhi Yin
IEEE Transactions on Parallel and Distributed Systems | VOL. 31
01 Sep 2020

A static-placement, dynamic-issue framework for CGRA loop accelerator
Zhongyuan Zhao ... Weifeng He
01 Mar 2017

Time sharing of Runtime Coarse-Grain Reconfigurable Architectures processing elements in multi-process systems
Benjamin Carrion Schafer
01 Dec 2014
