Abstract

Co-scheduling of jobs in data centers is a challenging scenario in which jobs can compete for resources, leading to severe slowdowns or failed executions. Efficient job placement in environments where resources are shared requires awareness of how jobs interfere with each other during execution, going far beyond ineffective resource overbooking techniques. Current approaches, many of which already involve machine learning and job modeling, rely on summarizing workload behavior over time rather than capturing the actual resource requirements of each job at every instant of its execution. In this work, we propose a methodology for modeling the co-scheduling of jobs in data centers, based on their resource usage behavior and execution time, using sequence-to-sequence models built on recurrent neural networks. The goal is to forecast the resource footprint of co-executed jobs throughout their execution time from the profiles exhibited by the individual jobs, in order to improve the placement decisions of resource managers and schedulers. The methods presented herein are validated using High Performance Computing benchmarks based on different frameworks (such as Hadoop and Spark) and applications (CPU-bound, I/O-bound, machine learning, SQL queries, etc.). Experiments show that the model correctly identifies resource usage trends for previously seen and even unseen co-scheduled jobs.
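
To illustrate the kind of model the abstract describes, the following is a minimal, hypothetical sketch (not the authors' implementation) of a sequence-to-sequence RNN that encodes the per-timestep resource profiles of two individually profiled jobs and decodes a forecast of their combined footprint when co-scheduled. All names, tensor shapes, the choice of GRU cells, and the four-resource layout are illustrative assumptions.

```python
# Hypothetical sketch: seq2seq RNN mapping two individual job resource
# profiles to a forecast of their co-execution footprint. Not the paper's code.
import torch
import torch.nn as nn

class CoScheduleSeq2Seq(nn.Module):
    def __init__(self, n_resources=4, hidden=64):
        super().__init__()
        # Encoder reads the two individual profiles, concatenated per timestep.
        self.encoder = nn.GRU(input_size=2 * n_resources, hidden_size=hidden,
                              batch_first=True)
        # Decoder unrolls over the co-execution horizon, emitting one
        # combined resource-usage vector per timestep.
        self.decoder = nn.GRU(input_size=n_resources, hidden_size=hidden,
                              batch_first=True)
        self.out = nn.Linear(hidden, n_resources)

    def forward(self, job_a, job_b, horizon):
        # job_a, job_b: (batch, time, n_resources) individual profiles.
        _, state = self.encoder(torch.cat([job_a, job_b], dim=-1))
        batch = job_a.size(0)
        step = torch.zeros(batch, 1, self.out.out_features)  # start token
        preds = []
        for _ in range(horizon):
            dec_out, state = self.decoder(step, state)
            step = self.out(dec_out)       # predicted footprint at this step
            preds.append(step)
        return torch.cat(preds, dim=1)     # (batch, horizon, n_resources)

# Usage example: two jobs profiled over 50 timesteps and 4 resources
# (e.g. CPU, memory, disk I/O, network), forecasting 60 co-execution steps.
model = CoScheduleSeq2Seq()
job_a = torch.rand(8, 50, 4)
job_b = torch.rand(8, 50, 4)
forecast = model(job_a, job_b, horizon=60)   # shape: (8, 60, 4)
```

In this sketch the forecast horizon is passed explicitly; in practice it could itself be predicted, since the abstract notes that execution time is part of the modeled behavior.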
