Abstract

Remote GPU execution has been shown to increase GPU occupancy and reduce job waiting time in multi-GPU batch-queue systems by allowing jobs to use remote GPUs when not enough local GPUs are unoccupied. However, for GPU-communication-intensive applications, remote-GPU communication overhead can account for more than 70% of execution time. Moreover, a job needs a remote GPU only while its assigned node lacks enough free local GPUs; a local GPU may become available later. We propose mrCUDA, a middleware for migrating execution from a remote GPU to a local GPU on demand. Our evaluation shows that for long-running jobs, mrCUDA's overhead accounts for less than 1% of total execution time. In addition, applying mrCUDA to first-come-first-serve (FCFS) job scheduling reduced job lifetimes (waiting plus execution time) by as much as 30% on average without changing the scheduling policy.
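The scheduling benefit described above can be illustrated with a minimal sketch. This is a simplified model under stated assumptions, not the paper's implementation: the function name `schedule_fcfs` and the single-node resource model are hypothetical. It shows why strict FCFS blocks the queue when the head job cannot fit on free local GPUs, whereas starting such jobs on remote GPUs (with mrCUDA-style on-demand migration back to local GPUs) lets later jobs proceed.

```python
from collections import deque

class Node:
    """A node with a fixed number of free local GPUs (simplified model)."""
    def __init__(self, local_gpus):
        self.free_local = local_gpus

def schedule_fcfs(node, queue, allow_remote=False):
    """Dispatch jobs from an FCFS queue.

    Returns (local_jobs, remote_jobs). With allow_remote=False, the
    head-of-line job blocks the queue when it cannot fit locally.
    With allow_remote=True, such a job starts on remote GPUs instead;
    mrCUDA would later migrate it to local GPUs on demand.
    """
    local, remote = [], []
    while queue:
        job = queue[0]
        if job["gpus"] <= node.free_local:
            node.free_local -= job["gpus"]
            local.append(queue.popleft()["name"])
        elif allow_remote:
            remote.append(queue.popleft()["name"])  # start remotely now
        else:
            break  # strict FCFS: wait for local GPUs to free up
    return local, remote

if __name__ == "__main__":
    jobs = [{"name": "A", "gpus": 2}, {"name": "B", "gpus": 4}, {"name": "C", "gpus": 1}]
    # Plain FCFS: B needs 4 GPUs, only 1 is free after A, so B and C both wait.
    print(schedule_fcfs(Node(3), deque(jobs), allow_remote=False))
    # With remote execution: B starts remotely, so C can still run locally.
    print(schedule_fcfs(Node(3), deque(jobs), allow_remote=True))
```

The second call dispatches every job immediately, which is the source of the reduced waiting times the abstract reports; mrCUDA's contribution is letting the remotely started job move back to a local GPU once one frees up.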

