Abstract

Scientific applications modelled as directed acyclic graphs (DAGs) combine complex calculations with large volumes of data transfer, which makes them very difficult to execute on traditional distributed computing platforms. The cloud provides a reliable solution for such applications due to its unique characteristics. In this setting, task clustering is performed, which combines two or more tasks into a single executable unit. Task clustering can help reduce system overheads such as queue delay and engine delay. Existing clustering algorithms in this domain focus more on the computational granularity of the tasks without considering the data dependencies among them. In this paper, a data-aware clustering algorithm is proposed that combines tasks according to the size of the data transferred between interdependent tasks. Experiments comparing the proposed algorithm with the existing baseline and balanced clustering algorithms showed that the proposed algorithm gave better makespan and cost for data-intensive workflow applications.
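
To make the idea of combining tasks by transferred data size concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the paper. It assumes a hypothetical edge format of (parent, child, data_size) tuples and a hypothetical `threshold` parameter, and it merges any two tasks joined by an edge that carries at least that much data, using a simple union-find structure. It does not check that the merged clusters still form an acyclic graph, which a real clustering algorithm would need to guarantee.

```python
from collections import defaultdict


def cluster_by_data_size(edges, threshold):
    """Group tasks joined by an edge carrying at least `threshold` units of data.

    edges: iterable of (parent, child, data_size) tuples (hypothetical format).
    Returns a sorted list of sorted task groups.
    """
    parent = {}  # union-find parent pointers, keyed by task name

    def find(task):
        # Register unseen tasks as their own cluster, then walk to the root.
        parent.setdefault(task, task)
        while parent[task] != task:
            parent[task] = parent[parent[task]]  # path halving
            task = parent[task]
        return task

    for src, dst, size in edges:
        root_src, root_dst = find(src), find(dst)
        # Merge only when the dependency moves a large amount of data.
        if size >= threshold and root_src != root_dst:
            parent[root_dst] = root_src

    groups = defaultdict(list)
    for task in parent:
        groups[find(task)].append(task)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical workflow: two heavy edges (t1->t2, t4->t5) and two light ones.
edges = [("t1", "t2", 500), ("t2", "t3", 10), ("t1", "t4", 20), ("t4", "t5", 900)]
clusters = cluster_by_data_size(edges, threshold=100)
print(clusters)
```

In this toy input, tasks connected by the two heavy edges end up in the same cluster, so their data never has to cross a cluster boundary, while the lightly coupled task t3 stays on its own.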
