Abstract
Coarse-Grained Reconfigurable Arrays (CGRAs) provide high-performance, energy-efficient execution of the innermost loops of an application. Most real-world applications, however, comprise deeply nested loops with complex and often irregular control flow structures that existing compilers cannot map to CGRAs. This leads to excessive data transfer costs, as execution continuously alternates between the outer loop nests on the host processor and the innermost loop on the CGRA accelerator. Moreover, ultra-low-power CGRAs can include only limited on-chip memory to cache the configuration bitstreams, so they need frequent swapping of configurations in the presence of multiple innermost loops. We introduce DNestMap, a partitioning and mapping tool for CGRAs that can judiciously extract the most beneficial code segments of multiple deeply nested loops and effectively cache them together statically in the configuration memory through spatio-temporal partitioning. DNestMap achieves a 1.58X performance improvement over dynamic caching of the configuration contexts of the innermost loops on CGRAs with limited on-chip memory.