Abstract

Markov models are often used to evaluate dependability attributes of fault-tolerant computer systems. The practical use of Markov models is, however, hampered by the well-known state space explosion problem. Simulation alleviates the problem. For Markov models of repairable fault-tolerant systems, standard simulation of dependability measures tends to be expensive because the system failure event is rare. Importance sampling can speed up the simulation. This paper develops two importance sampling schemes, called failure transition distance biasing and balanced failure transition distance biasing, which exploit the failure transition distance concept in an attempt to improve the efficiency of two other schemes, failure biasing and balanced failure biasing. The schemes require the computation of the so-called failure transition distances, and procedures to perform those computations are developed. The presentation is tied to a previously proposed measure-specific simulation method for the steady-state unavailability. A method for optimizing the parameters of the importance sampling schemes is also developed. For the simulation of the steady-state unavailability, failure transition distance biasing has (like failure biasing) the bounded relative error property for balanced fault-tolerant systems, and balanced failure transition distance biasing has (like balanced failure biasing) the bounded relative error property for both balanced and unbalanced fault-tolerant systems. It is proved that, for balanced fault-tolerant systems, both failure transition distance biasing and balanced failure transition distance biasing can indeed improve the efficiency of failure biasing and balanced failure biasing. In addition, numerical experiments seem to indicate that, for unbalanced fault-tolerant systems, balanced failure transition distance biasing can also improve the efficiency of balanced failure biasing.
The application of the failure transition distance-based importance sampling schemes is, however, limited to systems not having too many minimal failure covers, or, at least, not having too many minimal failure covers of small cardinality. A minimal failure cover is a minimal bag of failure bags such that the failure of its components implies the failure of the system; a failure bag is any non-empty bag of component classes which can fail simultaneously.
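
As a rough illustration of the baseline idea that the proposed schemes build on, the following minimal sketch applies plain failure biasing (not the failure transition distance-based schemes of the paper) to a toy model. Everything in it is an assumption made for illustration: a system of `N = 3` identical components with per-component failure rate `LAM`, a single repairman with rate `MU`, system failure when all components are down, and a biasing parameter `BIAS`. It estimates the probability that the embedded chain, started with one failed component, reaches system failure before returning to the all-up state, and compares the estimate with the value obtained by first-step analysis.

```python
import random

# All values below are assumptions chosen for illustration only.
N = 3        # components; the system is failed when all N are down
LAM = 1e-3   # per-component failure rate
MU = 1.0     # repair rate (single repairman)
BIAS = 0.5   # total probability assigned to failure transitions when biasing


def failure_prob(k):
    """Original embedded-chain probability of a failure from state k (k failed)."""
    rate_fail = (N - k) * LAM
    return rate_fail / (rate_fail + MU)


def simulate_once(rng):
    """Sample one biased path from state 1 until state 0 or N is hit.

    Returns the likelihood ratio if the path ends in system failure, else 0.
    """
    k, weight = 1, 1.0
    while 0 < k < N:
        p = failure_prob(k)
        if rng.random() < BIAS:          # biased failure transition
            weight *= p / BIAS
            k += 1
        else:                            # biased repair transition
            weight *= (1.0 - p) / (1.0 - BIAS)
            k -= 1
    return weight if k == N else 0.0


def estimate(n, seed=0):
    """Importance sampling estimate of P(hit N before 0 | start in 1)."""
    rng = random.Random(seed)
    return sum(simulate_once(rng) for _ in range(n)) / n


def exact():
    """First-step analysis, valid for N = 3 only: h1 = f1*h2, h2 = f2 + (1 - f2)*h1."""
    f1, f2 = failure_prob(1), failure_prob(2)
    return f1 * f2 / (1.0 - f1 * (1.0 - f2))


if __name__ == "__main__":
    print("estimate:", estimate(20000))
    print("exact:   ", exact())
```

Under the original probabilities almost every path returns to the all-up state without a system failure, so standard simulation would need a very large number of paths to observe even a few failures; the biased chain reaches failure often, and the likelihood ratio corrects for the change of measure.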

