Multi-job Scheduling Research Articles

Networking has become a well-known performance bottleneck for distributed machine learning (DML). Although many works have focused on accelerating the communication process of DML, they ignore the impact of the physical network on DML performance. Meanwhile, optical circuit switches (OCSes) are increasingly deployed in data centers and clusters and can fundamentally improve DML performance. Because the OCS reconfiguration delay is non-negligible, the OCS scheduling algorithm has a large impact on application-level performance. However, existing OCS scheduling solutions are not suitable for DML jobs, because these jobs are iterative and interleave communication and computation stages. Therefore, in this paper, we study online multi-job scheduling for DML in OCS networks. First, we propose heaviest-load-first (HLF), a heuristic algorithm for intra-job scheduling, based on the fact that the completion time of flows on the most heavily loaded port has a significant impact on the job completion time. Second, we present the Shortest Weighted Remaining Time First (SWRTF) algorithm for inter-job scheduling. In SWRTF, an available DML job is scheduled when the job being served moves from its communication stage to its computation stage, which significantly improves circuit utilization. Based on large-scale simulations, we demonstrate that HLF can reduce the iteration communication time by up to 64.97% compared to the state-of-the-art circuit scheduler Sunflow. In addition, SWRTF can reduce the Weighted-Job-Completion-Time (WJCT) by up to 42.9%, 54.2%, and 27.2% compared to the Shortest-Job-First, Baraat, and Weighted-First inter-job scheduling algorithms, respectively.
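The two ideas in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the exact rules, so the sketch assumes that HLF orders flows by the total load of their busiest endpoint port, and that SWRTF picks the ready job with the smallest ratio of remaining time to weight. The names `hlf_order`, `Job`, and `swrtf_pick` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


def hlf_order(flows):
    """Order flows so those touching the most heavily loaded port come first.

    flows: list of (src_port, dst_port, size) tuples.
    Assumption: a port's load is the sum of the sizes of all flows that use it.
    """
    load = defaultdict(float)
    for src, dst, size in flows:
        load[("src", src)] += size
        load[("dst", dst)] += size

    def key(flow):
        src, dst, size = flow
        # Primary key: load of the busier endpoint; tie-break: larger flow first.
        return (max(load[("src", src)], load[("dst", dst)]), size)

    return sorted(flows, key=key, reverse=True)


@dataclass
class Job:
    name: str
    weight: float          # importance of the job (higher is more important)
    remaining_time: float  # estimated remaining time of the job
    ready: bool            # True if the job is waiting to communicate


def swrtf_pick(jobs):
    """Pick the ready job with the smallest remaining_time / weight.

    Assumption: this ratio stands in for "weighted remaining time".
    Returns None when no job is ready.
    """
    ready = [job for job in jobs if job.ready]
    if not ready:
        return None
    return min(ready, key=lambda job: job.remaining_time / job.weight)


flows = [("a", "x", 10), ("a", "y", 5), ("b", "x", 1), ("b", "y", 2)]
ordered = hlf_order(flows)

jobs = [Job("A", 1, 10, True), Job("B", 4, 20, True), Job("C", 2, 1, False)]
chosen = swrtf_pick(jobs)
```

In this example, source port `a` carries the heaviest load (15 units), so both of its flows are ordered first. Job `B` is chosen because its ratio (20 / 4 = 5) is the smallest among the ready jobs; job `C` is skipped because it is not ready. In the scheme the abstract describes, such a selection would be triggered when the job being served enters its computation stage.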


Cloud computing represents a major paradigm shift in computing systems. It offers high scalability and flexibility across an assortment of on-demand services. Improving the performance of distributed applications in a multi-cloud environment requires both low energy consumption and minimal inter-node latency. The major problem is that the energy efficiency of a cloud data center is low when the number of servers is small and increases otherwise. To address the energy-efficiency and network-latency problems, a novel energy-efficient particle swarm optimization representation for multi-job scheduling and a latency representation for grouping nodes with respect to network latency are proposed. Scheduling is performed on the basis of network latency and energy efficiency. The scheduling schema is the main part of the Cloud Scheduler component and helps the scheduler make decisions based on different criteria. It also works well with incomplete latency information and performs intelligent grouping on the basis of both network latency and energy efficiency. A practical particle swarm optimization algorithm is designed for the cloud servers, and an overall energy-efficiency measure is constructed from the function of the servers and a fitness value calculated for each cloud server. In addition, a local search operator is introduced to speed up convergence and improve the search ability of the algorithm. Finally, experiments demonstrate that the proposed algorithm is effective and efficient.
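As a rough illustration of the particle swarm optimization approach this abstract describes, the following Python sketch assigns jobs to servers so as to minimize a simple energy function. Everything specific here is an assumption for illustration, not the paper's model: the energy function (idle power plus load-proportional power for each server in use), the rounding-based decoding of particle positions, the coefficients `w`, `c1`, `c2`, and the names `energy`, `decode`, and `pso_schedule`. The latency-based grouping and the local search operator mentioned in the abstract are omitted.

```python
import random


def energy(assignment, job_load, idle_power, power_per_load):
    """Hypothetical fitness: servers in use pay idle power plus power per unit load."""
    load = [0.0] * len(idle_power)
    for job, server in enumerate(assignment):
        load[server] += job_load[job]
    return sum(idle_power[s] + power_per_load[s] * load[s]
               for s in range(len(idle_power)) if load[s] > 0)


def decode(position, n_servers):
    """Map a continuous particle position to one server index per job."""
    return [min(n_servers - 1, max(0, int(x))) for x in position]


def pso_schedule(job_load, idle_power, power_per_load,
                 particles=20, iters=100, seed=0):
    rng = random.Random(seed)  # seeded for reproducible runs
    n_jobs, n_servers = len(job_load), len(idle_power)

    def fit(pos):
        return energy(decode(pos, n_servers), job_load, idle_power, power_per_load)

    # Random initial positions, zero initial velocities.
    xs = [[rng.uniform(0, n_servers) for _ in range(n_jobs)] for _ in range(particles)]
    vs = [[0.0] * n_jobs for _ in range(particles)]
    pbest = [x[:] for x in xs]
    pbest_fit = [fit(x) for x in xs]
    g = min(range(particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients (assumed values)
    for _ in range(iters):
        for i in range(particles):
            for d in range(n_jobs):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + pull to personal and global best.
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                # Keep the position inside the valid range of server indices.
                xs[i][d] = min(n_servers - 1e-9, max(0.0, xs[i][d] + vs[i][d]))
            f = fit(xs[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = xs[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = xs[i][:], f
    return decode(gbest, n_servers), gbest_fit


job_load = [2.0, 1.0, 3.0]
idle_power = [10.0, 5.0]
power_per_load = [1.0, 2.0]
assignment, best = pso_schedule(job_load, idle_power, power_per_load)
```

Here `assignment[j]` is the server index chosen for job `j`, and `best` is the energy of that assignment under the assumed model. A local search step, as the abstract suggests, could be added after each iteration to refine the global best, for example by trying to move single jobs between servers.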



Articles published on Multi-job Scheduling

Multi-Job Scheduling and Optimization Model for Cloud Computing

Online job scheduling for distributed machine learning in optical circuit switch networks

Energy- and locality-efficient multi-job scheduling based on MapReduce for heterogeneous datacenter

An energy-aware bi-level optimization model for multi-job scheduling problems under cloud computing

Energy-Efficient PSO and Latency Based Group Discovery Algorithm in Cloud Scheduling

Improving Performance in Cloud using Multi-Job Scheduling based Group Discovery Algorithm

Resource intensity aware job scheduling in a distributed cloud

A new multi-objective bi-level programming model for energy and locality aware multi-job scheduling in cloud computing
