Abstract
Coarse-Grained Reconfigurable Architectures (CGRAs) are a promising solution for domain-specific applications owing to their energy efficiency and flexibility. To improve performance on a CGRA, modulo scheduling is commonly applied to the Data Dependence Graph (DDG) of a loop to minimize the Initiation Interval (II) between adjacent loop iterations. The mapping process usually consists of scheduling followed by placement-and-routing (P&R). Because existing approaches do not fully and globally explore the routing strategies for long dependencies in a DDG at the scheduling stage, the subsequent P&R is prone to failure, leading to performance loss. To this end, this paper proposes a routability-enhanced scheduling approach for CGRA mapping using an Integer Linear Programming (ILP) formulation, in which a globally optimized schedule can be found to improve the success rate of P&R. Experimental results show that our approach achieves 1.12× and 1.22× performance speedup and 28.7% and 50.2% compilation time reduction compared with two state-of-the-art heuristics.
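To make the role of the II concrete, the following is a minimal sketch of the standard resource-constrained lower bound on the II and of how a schedule time folds into a modulo slot. It is not the paper's ILP formulation; the function names `res_mii` and `modulo_slot` are hypothetical, and the bound ignores recurrence and routing constraints.

```python
import math


def res_mii(num_ops: int, num_pes: int) -> int:
    """Resource-constrained lower bound on the II (assumption: every
    operation occupies exactly one PE for exactly one cycle)."""
    return math.ceil(num_ops / num_pes)


def modulo_slot(time_step: int, ii: int) -> int:
    """Slot inside the repeating II-cycle window that a schedule time occupies."""
    return time_step % ii


# Illustrative numbers: a 4-node DDG on a 2-PE array.
ii = res_mii(num_ops=4, num_pes=2)
slots = [modulo_slot(t, ii) for t in range(4)]
```

With these illustrative numbers the lower bound is an II of 2, so operations scheduled at time steps 0 and 2 compete for the same modulo slot. A real scheduler would still have to verify that every dependence can be routed at that II.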
Highlights
With both energy efficiency and program flexibility, Coarse-Grained Reconfigurable Architectures (CGRAs) are becoming attractive alternatives for embedded systems
Routing strategies of long dependence: As shown in Fig. 3, the three types of routing strategies are illustrated with a 4-node Data Dependence Graph (DDG) mapped onto a 1×2 Processing Element Array (PEA) that is time-extended to 4 time steps, where the yellow and blue cells indicate a Function Unit (FU) and a Local Register File (LRF), respectively
In a mesh-plus routing CGRA, FUs are connected in a mesh network, where each FU is connected to its immediate neighbors and to its near neighbors reachable with 1 hop
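The mesh-plus connectivity described above can be sketched as follows. This is an illustration under a stated assumption, not the paper's architecture model: "near neighbors with 1 hop" is read here as the FUs two positions away in the same row or column, and the function name `mesh_plus_neighbors` is hypothetical.

```python
def mesh_plus_neighbors(row: int, col: int, rows: int, cols: int) -> list:
    """Return the FUs directly connected to (row, col) in an assumed
    mesh-plus layout on a rows x cols array."""
    neighbors = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        # dist 1: immediate neighbor; dist 2: assumed "near neighbor with 1 hop"
        for dist in (1, 2):
            r, c = row + d_row * dist, col + d_col * dist
            if 0 <= r < rows and 0 <= c < cols:
                neighbors.append((r, c))
    return sorted(neighbors)


corner = mesh_plus_neighbors(0, 0, rows=4, cols=4)
tiny = mesh_plus_neighbors(0, 0, rows=1, cols=2)
```

Under this reading, a corner FU in a 4×4 array connects to four other FUs, while in the 1×2 PEA used in the example above each FU connects only to the other one.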
Summary
With both energy efficiency and program flexibility, Coarse-Grained Reconfigurable Architectures (CGRAs) are becoming attractive alternatives for embedded systems. We note that because routing strategies are not explored fully (at both the initial scheduling and the rescheduling stage) and globally (on the whole DDG rather than only on unmapped nodes), the subsequent P&R is prone to failure, leading to performance loss. To this end, as shown in the third row, this paper proposes Routability-Enhanced Scheduling for Mapping (RESMap) applications on CGRAs, which supports global routing exploration in both stages.