Abstract

Users of high-performance computers are currently overwhelmed with tasks that do not scale, such as job submission and monitoring. Many users are also limited in the number of jobs they can submit to a single High Performance Computing (HPC) resource at a time, which results in very long queue times. Digital Sherpa is a grid application that executes jobs on many separate HPC resources at once, which can reduce total queue time. It automates job submission and monitoring, and it includes recovery features such as resubmission of failed jobs. Digital Sherpa has been implemented for MGAC, a parallel distributed application that uses genetic algorithms to predict atomic clusters and crystal structures. It has also been used successfully in a prototype of an HPC-oriented combustion simulation application and on the TeraGrid. The high-level goal is for Digital Sherpa to interoperate with any HPC application.
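The submit, monitor, and resubmit cycle described above can be illustrated with a minimal sketch. This is not Digital Sherpa's implementation; it is a simplified, synchronous simulation written under stated assumptions. The names `Resource`, `run_all`, `execute`, and `max_retries` are hypothetical, each resource is assumed to enforce a fixed per-user job cap, and a real grid client would submit through the resource's own scheduler and poll asynchronously.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Resource:
    """Hypothetical HPC resource with a per-user cap on concurrent jobs."""
    name: str
    max_jobs: int
    active: List[Tuple[str, int]] = field(default_factory=list)


def run_all(
    jobs: List[str],
    resources: List[Resource],
    execute: Callable[[str, str], bool],
    max_retries: int = 2,
) -> Tuple[Dict[str, str], List[str]]:
    """Spread jobs over several resources and resubmit failed ones.

    `execute(job, resource_name)` stands in for submitting a job and
    waiting for its outcome; it returns True on success (an assumption
    made for this sketch, not a real scheduler call).
    """
    if sum(r.max_jobs for r in resources) <= 0:
        raise ValueError("no resource has free capacity")

    pending = deque((job, 0) for job in jobs)  # (job id, failures so far)
    placements: Dict[str, str] = {}            # job id -> resource it ran on
    failed: List[str] = []                     # jobs that exhausted retries

    while pending:
        # Submission: fill every free slot on every resource.
        for res in resources:
            while pending and len(res.active) < res.max_jobs:
                res.active.append(pending.popleft())

        # Monitoring and recovery: record successes, requeue failures.
        for res in resources:
            for job, failures in res.active:
                if execute(job, res.name):
                    placements[job] = res.name
                elif failures < max_retries:
                    pending.append((job, failures + 1))
                else:
                    failed.append(job)
            res.active.clear()

    return placements, failed
```

In this sketch, a job that fails is returned to the shared queue and may land on a different resource on its next attempt, which is one simple way to combine the per-resource submission limit with automatic recovery.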
