Abstract
Cloud IaaS platforms readily provide access to homogeneous multi-core machines, whether physical ("bare metal") or virtual. Each machine can be equipped with high-performance SSDs, so workflow-generated files can be distributed across multiple machines, which helps minimize the overhead of data transfers. In this paper, we propose a scheduling algorithm called SMDT-ERU (Scheduling for Minimizing Data Transfer - Enhancing Resource Utilization), designed to reduce the makespan of data-intensive workflows by minimizing network data transfers between dependent tasks. Intermediate files generated by a task are stored on the local disk of the machine where that task executes. Through experimentation, we confirm that increasing the number of cores per machine reduces the additional cost caused by network data transfers. Experiments on real-world workflows demonstrate the advantages of the proposed algorithm. Our data-driven scheduling approach significantly reduces both execution time and the volume of data transferred over the network, outperforming one of the leading state-of-the-art algorithms, which we adapted to fit our assumptions.
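The locality idea described in the abstract can be illustrated with a minimal sketch. This is not SMDT-ERU itself, whose details the abstract does not give. It is a simplified greedy placement under stated assumptions: tasks are processed level by level, each machine runs at most one task per core within a level, and a task is placed on the free machine that already holds the most bytes of its input files. The function name, the toy workflow, and the file sizes are all hypothetical.

```python
def place_by_locality(levels, inputs, n_machines, cores_per_machine):
    """Greedy data-locality placement (illustrative only, not SMDT-ERU).

    levels: list of lists of task names, in dependency order.
    inputs: task -> {parent task: size in bytes of the intermediate file}.
    Returns (placement, network_bytes), where placement maps task -> machine index.
    """
    placement = {}
    network_bytes = 0
    for level in levels:
        # Assumption: every core is free again at the start of each level.
        free = [cores_per_machine] * n_machines
        for task in level:
            # Bytes of this task's inputs already stored on each machine's disk.
            local = [0] * n_machines
            for parent, size in inputs.get(task, {}).items():
                local[placement[parent]] += size
            candidates = [m for m in range(n_machines) if free[m] > 0]
            if not candidates:
                raise ValueError("level has more tasks than available cores")
            # Prefer the machine holding the most input data, then the one
            # with the most free cores; ties go to the lowest index.
            best = max(candidates, key=lambda m: (local[m], free[m]))
            placement[task] = best
            free[best] -= 1
            # Inputs stored on any other machine must cross the network.
            network_bytes += sum(local) - local[best]
    return placement, network_bytes


# Hypothetical workflow: a fans out to b, c, d, which all feed e.
levels = [["a"], ["b", "c", "d"], ["e"]]
inputs = {
    "b": {"a": 100},
    "c": {"a": 100},
    "d": {"a": 100},
    "e": {"b": 50, "c": 50, "d": 50},
}

for machines, cores in [(4, 1), (2, 2), (1, 4)]:
    _, moved = place_by_locality(levels, inputs, machines, cores)
    print(f"{machines} machine(s) x {cores} core(s): {moved} bytes over the network")
```

With four cores in total, this toy model moves 300 bytes over the network on four single-core machines, 150 bytes on two dual-core machines, and 0 bytes on one quad-core machine. That is consistent in spirit with the abstract's observation that more cores per machine reduce the cost of network transfers, although the model ignores task runtimes, bandwidth, and makespan.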
More From: INTERNATIONAL JOURNAL OF MATHEMATICS AND COMPUTER RESEARCH