Obtaining an optimal schedule for a set of precedence-constrained tasks is a well-known NP-complete problem in its general form. Because the problem is intractable, most previous work relies on heuristics that try to find reasonably high-quality solutions in an acceptable amount of time. Optimal polynomial-time algorithms are known only for a few simple cases; in all other cases, optimal solutions can be obtained only through an exhaustive search with prohibitively high time complexity. Nevertheless, optimal solutions may be critically important for applications in which performance is the prime objective, and they can also serve as a reference for testing the performance of various heuristics. Moreover, an optimal schedule for a given program needs to be determined only once (and off-line), whereas the program using that schedule is in general executed several times. In this paper, we propose optimal algorithms for static scheduling of task graphs with arbitrary parameters onto multiple homogeneous processors. The first algorithm is based on the A* search technique and uses a computationally efficient cost function to guide the search with reduced complexity. Additionally, we propose a number of effective state-pruning techniques to reduce the search space. To lower the complexity further, we propose an efficient parallelization of the search algorithm that reduces interprocessor communication and uses static and dynamic load-balancing schemes to distribute the search states evenly among the processors. We also propose an approximate algorithm that guarantees a bounded deviation from the optimal solution but executes in considerably shorter time.
Based on an extensive experimental evaluation of the algorithms, we conclude that the parallel algorithm with pruning techniques is an efficient scheme for generating optimal solutions of reasonably large problems while the approximate algorithm is effective if slightly degraded solutions are acceptable.
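To make the A* formulation concrete, the following is a minimal sequential sketch, not the algorithm evaluated in the paper. It assumes a simple admissible cost function (the start time of each placed task plus its static level, that is, the longest computation-only path to an exit task) and two generic pruning rules: duplicate-state elimination and treating empty processors as interchangeable. The function names `static_levels` and `astar_schedule`, the dictionary-based graph encoding, and the example graph are all hypothetical choices made for illustration.

```python
import heapq
from itertools import count


def static_levels(cost, succ):
    """Longest computation-only path from each task to an exit task."""
    memo = {}

    def level(n):
        if n not in memo:
            memo[n] = cost[n] + max((level(s) for s in succ.get(n, {})), default=0)
        return memo[n]

    return {n: level(n) for n in cost}


def astar_schedule(cost, succ, num_procs):
    """cost: task -> computation cost; succ: task -> {successor: comm cost}.

    Returns (schedule length, {task: (processor, start, finish)}).
    """
    pred = {n: {} for n in cost}
    for u, outs in succ.items():
        for v, comm in outs.items():
            pred[v][u] = comm
    level = static_levels(cost, succ)
    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(max(level.values()), next(tie), {})]
    seen = set()
    while heap:
        f, _, placed = heapq.heappop(heap)
        if len(placed) == len(cost):
            return f, placed  # f equals the schedule length for a complete state
        proc_free = [0] * num_procs
        used = set()
        for p, _, finish in placed.values():
            proc_free[p] = max(proc_free[p], finish)
            used.add(p)
        ready = [n for n in cost
                 if n not in placed and all(q in placed for q in pred[n])]
        for n in ready:
            tried_empty = False
            for p in range(num_procs):
                if p not in used:
                    # Assumed pruning: empty processors are interchangeable.
                    if tried_empty:
                        continue
                    tried_empty = True
                start = proc_free[p]
                for q, comm in pred[n].items():
                    qp, _, qfinish = placed[q]
                    # Communication cost applies only across processors.
                    start = max(start, qfinish + (comm if qp != p else 0))
                child = dict(placed)
                child[n] = (p, start, start + cost[n])
                key = frozenset(child.items())
                if key in seen:  # assumed pruning: skip duplicate states
                    continue
                seen.add(key)
                heapq.heappush(heap, (max(f, start + level[n]), next(tie), child))
    return None


# Hypothetical four-task diamond graph with unit communication costs.
cost = {"a": 2, "b": 3, "c": 3, "d": 2}
succ = {"a": {"b": 1, "c": 1}, "b": {"d": 1}, "c": {"d": 1}}
length, placed = astar_schedule(cost, succ, 2)
print(length, placed)
```

Because the assumed cost function never overestimates the length of any completion of a partial schedule, the first complete state removed from the priority queue is optimal. The price is a search space that grows exponentially with the number of tasks, which is what the pruning and parallelization described above are meant to mitigate.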