Abstract

Almost every computational task requires input data before a solution can be found. This is not a problem in a centralized system, where the data is usually available locally. In a parallel and distributed system such as a computational grid, however, the data may reside at remote sites and must be transferred to the local site before the computation can proceed. As a result, the interleaved sequence of data transfers and job executions has a significant impact on overall computational efficiency. In this paper, we analyze the computational complexity of the shared-data job scheduling problem on a uniprocessor, both with and without a storage capacity constraint at the local site. We show that if there is an upper bound on the local storage capacity, the problem is NP-complete, even when each job depends on at most two data items. For the case where the storage capacity is unbounded, we show that there is an efficient algorithm that produces an optimal job schedule when each job depends on at most two data items. We also propose an efficient heuristic that computes good schedules when there is no limit on the number of data items a job may access. The reported experimental results demonstrate that this heuristic performs very well and derives near-optimal solutions.
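The abstract does not spell out the paper's algorithms, so the sketch below is only an illustration of the problem setting it describes: jobs run one at a time on a uniprocessor, each job's missing data items must first be transferred to the local site, and (in the unbounded-storage case) transferred data stays cached so it is never fetched twice. The greedy rule used here, picking the job with the least remaining work (pending transfers plus execution time), is an illustrative stand-in under these assumptions, not the paper's optimal algorithm or its proposed heuristic; all names (greedy_schedule, need, xfer_time, etc.) are hypothetical.

```python
# Hypothetical sketch of shared-data job scheduling on a uniprocessor
# with unbounded local storage. NOT the paper's algorithm: a generic
# least-remaining-work greedy, shown only to make the setting concrete.

from typing import Dict, List, Set, Tuple


def greedy_schedule(need: Dict[str, Set[str]],
                    exec_time: Dict[str, float],
                    xfer_time: Dict[str, float]) -> Tuple[List[str], float]:
    """Return a job order and the total completion time it incurs.

    need      -- job -> data items the job depends on
    exec_time -- job -> processing time on the local site
    xfer_time -- data item -> time to transfer it from its remote site
    """
    cached: Set[str] = set()      # data already on the local site
    remaining = dict(need)
    order: List[str] = []
    clock = 0.0                   # current time on the uniprocessor
    total_completion = 0.0
    while remaining:
        # Greedy step: choose the job whose pending transfers plus
        # execution take the least time from now.
        job = min(remaining,
                  key=lambda j: exec_time[j]
                  + sum(xfer_time[d] for d in remaining[j] - cached))
        missing = remaining.pop(job) - cached
        clock += sum(xfer_time[d] for d in missing) + exec_time[job]
        cached |= missing         # unbounded storage: never evict, never refetch
        order.append(job)
        total_completion += clock
    return order, total_completion


if __name__ == "__main__":
    # Three jobs, each depending on two shared data items.
    need = {"j1": {"a", "b"}, "j2": {"b", "c"}, "j3": {"a", "c"}}
    exec_time = {"j1": 2.0, "j2": 1.0, "j3": 1.0}
    xfer_time = {"a": 3.0, "b": 1.0, "c": 2.0}
    print(greedy_schedule(need, exec_time, xfer_time))
```

With unbounded storage every data item is transferred exactly once regardless of order, so the total transfer time is fixed; what the job order changes is when each job completes, which is why the sketch accumulates total completion time rather than transfer cost.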
