Abstract

Loop bounds are often unknown until run time, which makes it difficult to analyze non-functional properties such as latency at compile time. Likewise, a static allocation of processing resources to loop computations may be too conservative with respect to given performance requirements, or suboptimal with respect to energy consumption. To still satisfy requirements when accelerating loop nests under uncertain loop bounds, we formalize and propose an approach to run-time requirement enforcement: at run time, select from a set of candidate mappings one that satisfies a given set of requirements while optimizing secondary objectives. Because the space of candidate mappings might be prohibitively large to evaluate at run time, we further introduce two approaches to reduce its cardinality: 1) architecture-specific reduction, which solves for parts of the mapping from the requirements, and 2) design-time reduction, which finds a k-subset of mappings that maximizes the number of loop bounds for which the requirements are satisfied. We implemented the proposed run-time requirement enforcement techniques for a representative class of programmable processor array architectures called tightly coupled processor arrays (TCPAs). A case study demonstrates their effectiveness: we can satisfy given latency requirements while saving up to 10 % in energy.
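
The selection step and the design-time reduction described above can be illustrated with a minimal sketch. It is not the implementation from the paper: the `Mapping` type, the `latency` and `energy` cost models, and the function names `select_mapping` and `best_k_subset` are all hypothetical, and the k-subset search is a plain brute-force enumeration chosen only for readability.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Mapping:
    """Hypothetical candidate mapping: a loop run on `num_pes` processing elements."""
    num_pes: int


def latency(m: Mapping, n: int) -> int:
    # Assumed toy model: n iterations split across the PEs, plus a
    # fill/drain overhead that grows with the number of PEs.
    return -(-n // m.num_pes) + m.num_pes  # ceil(n / num_pes) + num_pes


def energy(m: Mapping, n: int) -> int:
    # Assumed toy model: every allocated PE is powered for the whole run.
    return m.num_pes * latency(m, n)


def select_mapping(candidates: Sequence[Mapping], n: int,
                   max_latency: int) -> Optional[Mapping]:
    """Run-time step: once the loop bound n is known, keep the candidates
    that meet the latency requirement and return the one with the lowest
    energy (the secondary objective), or None if no candidate qualifies."""
    feasible = [m for m in candidates if latency(m, n) <= max_latency]
    return min(feasible, key=lambda m: energy(m, n)) if feasible else None


def best_k_subset(candidates: Sequence[Mapping], bounds: Sequence[int],
                  max_latency: int, k: int) -> Tuple[Tuple[Mapping, ...], int]:
    """Design-time step: pick k candidates that maximize the number of loop
    bounds for which at least one of them meets the requirement."""
    def covered(subset: Tuple[Mapping, ...]) -> int:
        return sum(1 for n in bounds
                   if any(latency(m, n) <= max_latency for m in subset))

    k = min(k, len(candidates))
    best = max(combinations(candidates, k), key=covered)
    return best, covered(best)


candidates = [Mapping(p) for p in (1, 2, 4, 8)]
chosen = select_mapping(candidates, n=64, max_latency=20)
subset, coverage = best_k_subset(candidates, [8, 32, 64, 96, 100],
                                 max_latency=20, k=2)
```

Under these assumed models, small loop bounds are served by a single processing element and larger bounds force larger allocations, which is the trade-off that makes a run-time choice preferable to one static mapping. The brute-force search enumerates every k-subset, so it is only practical for small candidate sets.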
