Abstract

Frequent service downtime and poor system performance can degrade availability and quality of experience, and can cost millions of dollars in lost revenue. High Performance Computing (HPC) environments are often required to comply with performance and dependability requirements. The CHESS methodology supports the design and evaluation of the dependability and performance attributes of systems. In this paper, we extend the CHESS methodology to support the design and dependability analysis of HPC environments. The proposed approach was applied to Grid’5000, a highly distributed and I/O-intensive HPC environment. Its application provided key information for demonstrating dependability, deriving project decisions, and agreeing on new design choices and resource allocation strategies.
