Abstract

This paper describes the design and initial implementation of a software framework for exploiting resizability in distributed-memory parallel applications. By “resizability” we mean the ability to expand or contract, at run time, the number of processes participating in a parallel application. The ReSHAPE framework described here includes a cluster scheduler, a library supporting data redistribution and process remapping, and an application programming interface (API) that allows applications to interact with the scheduler and resizing library with only minor code modifications. Parallel applications executed using the ReSHAPE framework can expand to take advantage of additional free processors, or contract to accommodate a high-priority application, without being suspended. Experimental results show that the ReSHAPE framework can significantly improve individual job turnaround time and overall system throughput, even with very simple application scheduling policies. In addition, the framework serves as a convenient platform for research into much more sophisticated cluster scheduling policies and methods.
