Dataflow modeling is well suited for a large variety of applications for modern multi-core architectures, e.g., from the signal processing and the control domain. Furthermore, Design Space Exploration (DSE) can be used to explore mappings of tasks to hardware resources (cores of an MPSoC) and their scheduling to obtain optimized trade-off solutions between throughput and resource costs. However, the throughput evaluation of an implementation candidate via compilation-in-the-loop or simulation-based approaches can be extremely time-consuming. Such a deficiency is very detrimental, because a typical DSE run needs to evaluate thousands of implementation candidates. As a remedy, we propose a hybrid-adaptive DSE where a max-plus algebra-based analytic throughput calculation method is used in the initial DSE phase to enable a fast progress of the search space exploration. However, as this analysis may be inaccurate as neglecting some real-world effects like cache and scheduling overhead, throughput measurements are taken later in the DSE. Moreover, we explore the trade-off between scheduling efficiency of implementation candidates—in favor of reducing concurrency—and exploiting concurrency to a large extent for parallel execution of the application. To find solutions of highest achievable throughput, it is shown that not only highly scheduling efficient implementation candidates but also highly parallel implementation candidates are essential when determining the initial population. In this realm, we contribute a method for diversity-based population initialization. For a representative set of benchmarks, it is shown that the combination of the two major contributions allows us to find much higher throughput multi-core solutions within a given exploration time compared to a state-of-the-art DSE approach.
Read full abstract