We study the semi-online MapReduce scheduling problem on two parallel machines, with the objective of minimizing the makespan. Jobs arrive over list, and each job consists of a map task and a reduce task. A map task is preemptive and may be processed on both machines simultaneously, whereas a reduce task is non-preemptive and cannot start until the map task of the same job has completed. We consider two versions of the problem. In the first, the processing time of the largest reduce task is known in advance, and we design an optimal semi-online algorithm with competitive ratio 4/3. In the second, both the processing time of the largest reduce task and the sum of the processing times are known in advance, and we present a 4/3-competitive semi-online algorithm, which is best possible when the largest reduce task satisfies certain conditions.
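To make the job model concrete, the following is a minimal sketch of a simple lower bound on the optimal makespan for two machines. It is not the algorithm of the paper; the names `Job` and `makespan_lower_bound` are hypothetical and introduced only for illustration. The bound combines two observations that follow directly from the model: the total work must be shared by two machines, and a job's map task needs at least half its length even when split across both machines, after which its reduce task runs without interruption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical job record: one map task and one reduce task."""
    map_time: float     # preemptive; may run on both machines at once
    reduce_time: float  # non-preemptive; starts only after the map task ends


def makespan_lower_bound(jobs):
    """Return a simple lower bound on the optimal makespan on two machines."""
    if not jobs:
        return 0.0
    # Load bound: two machines cannot finish the total work faster than half of it.
    total_work = sum(j.map_time + j.reduce_time for j in jobs)
    load_bound = total_work / 2
    # Job bound: the map task takes at least map_time / 2 (split over both
    # machines), and the reduce task then needs reduce_time on one machine.
    job_bound = max(j.map_time / 2 + j.reduce_time for j in jobs)
    return max(load_bound, job_bound)


if __name__ == "__main__":
    example = [Job(4, 3), Job(2, 5), Job(6, 1)]
    print(makespan_lower_bound(example))
```

A competitive analysis typically compares an algorithm's makespan against bounds of this kind, since the optimal offline makespan can never fall below either term.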