Abstract

Coarse-grained reconfigurable architectures (CGRAs) are a promising class of programmable hardware that offers high power efficiency and high performance. However, compiling and optimizing loops with irregular branches remains a challenge to realizing that performance potential on CGRAs. Existing predication techniques, such as partial predication (PP) and full predication (FP), conservatively implement a software pipeline with a static initiation interval (II) derived from the maximum graph. Because only part of that graph is actually executed in each loop iteration, performance is underexploited. To exploit more loop-level parallelism for irregular branches, this article proposes a novel dynamic-II pipeline (DIP) scheme, which realizes a pipeline with a variable II by accommodating multiple iterations of the short path in one static configuration. Since the DIP scheme is effective for only certain types of branches, this article also designs a hybrid compilation framework that integrates other complementary methods and selects the appropriate method for each source program according to a proposed performance evaluation model. Experimental results show that: 1) the hybrid compilation framework can effectively extract branch features and correctly choose and implement the corresponding branch processing methods within acceptable compile time and 2) compared with PP and FP, DIP reduces total execution time (TET) by 27.21% and 22.04% on average, respectively, when the execution probability of the short branch is 50%.
