Abstract

This work investigates how to map loops efficiently onto Coarse-Grained Reconfigurable Architectures (CGRAs). We examine the properties of CGRAs and build MapReduce-inspired models for the loop parallelization problem. The proposed models use a more detailed performance metric and a more flexible unrolling scheme that can unroll different loop levels with different factors. We propose a Geometric Programming based approach to solve the resulting optimization problem. The approach finds the optimal unrolling factor for each loop level, which yields better loop parallelization. Experimental results show that the proposed approach achieves up to a 44% performance gain over the state-of-the-art loop mapping scheme.
