Abstract

Since its introduction, the Open Source Cluster Application Resources (OSCAR) software package has been a widely accepted choice for building high-performance computing systems. As OSCAR is increasingly deployed in mission-critical environments, high availability (HA) features need to be built into OSCAR clusters. In this paper, we present an HA solution for OSCAR clusters. We adopt component redundancy, a technique widely used in HA solutions, to improve system availability. Based on the proposed architecture, we develop a detailed failure-repair model for predicting the availability of an HA OSCAR cluster. Stochastic reward nets (SRNs) are used to model the behavior of the system. We specify our SRN model in the Stochastic Petri Net Package (SPNP), then describe and compute several output measures that characterize the availability of the HA OSCAR cluster.
