We discuss the estimation of derivatives of a performance measure using the likelihood ratio method in simulations of highly reliable Markovian systems. We compare the difficulty of estimating the performance measure with that of estimating its partial derivatives with respect to component failure rates, as the component failure rates tend to 0 and the component repair rates remain fixed. We first consider the case when the quantities are estimated using naive simulation; i.e., when no variance reduction technique is used. In particular, we prove that in the limit, some of the partial derivatives can be estimated as accurately as the performance measure itself. This result is of particular interest in light of the somewhat pessimistic empirical results others have obtained when applying the likelihood ratio method to other types of systems. However, under naive simulation the result holds only for certain partial derivatives of the performance measure. More specifically, we can estimate a partial derivative with the same relative accuracy as the performance measure if the partial derivative is associated with a component that has one of the largest failure rates or whose failure can trigger a failure transition on one of the “most likely paths to failure.” We also develop a simple criterion to determine which partial derivatives satisfy either of these properties: these derivatives can be identified using a sensitivity measure that can be calculated for each type of component. We then examine the limiting behavior of the estimates of the performance measure and its derivatives that are obtained when an importance sampling scheme known as balanced failure biasing is used. In particular, we show that the estimates of all derivatives can be improved.
In contrast to the situation that arose when using naive simulation, we prove that in the limit, all derivatives can be estimated as accurately as the performance measure when balanced failure biasing is employed. Finally, we formalize the notion of a “most likely path to failure” in the setting of highly reliable Markovian systems. We accomplish this by proving a conditional limit theorem for the distribution of the sample paths leading to a system failure, given that a system failure occurs before the system returns to the state with all components operational. We use this result to establish our other results.
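As a rough illustration of the likelihood ratio method described above, the following Python sketch estimates the probability that a system fails before returning to the state with all components operational, together with its derivative with respect to the failure rate. The toy model is an assumption made for this example and is not taken from the paper: `n` identical components that each fail at rate `lam`, a single repair facility with rate `mu`, and system failure when all `n` components are down. All function and parameter names are hypothetical. Setting `rho` replaces the failure probability in every state with a fixed value, a simple form of failure biasing that coincides with balanced failure biasing here only because the toy model has a single component type.

```python
import random


def exact_gamma(lam, mu, n):
    """Closed-form P(all n components fail before returning to all-up),
    from the gambler's-ruin formula for this toy birth-death chain."""
    total, prod = 1.0, 1.0
    for i in range(1, n):
        prod *= mu / ((n - i) * lam)  # ratio of repair to failure probability in state i
        total += prod
    return 1.0 / total


def simulate(lam, mu, n, n_paths, rho=None, seed=0):
    """Return estimates of (gamma, d gamma / d lam).

    rho=None gives naive simulation; otherwise failure transitions are
    sampled with probability rho and reweighted by the likelihood ratio.
    """
    rng = random.Random(seed)
    sum_gamma = 0.0
    sum_deriv = 0.0
    for _ in range(n_paths):
        k = 1        # the first transition out of the all-up state is a failure
        score = 0.0  # sum of d/d(lam) log(transition probability) along the path
        lr = 1.0     # likelihood ratio: original measure over sampling measure
        while 0 < k < n:
            fail_rate = (n - k) * lam
            total = fail_rate + mu
            p_fail = fail_rate / total
            q_fail = p_fail if rho is None else rho
            if rng.random() < q_fail:
                score += 1.0 / lam - (n - k) / total
                lr *= p_fail / q_fail
                k += 1
            else:
                score -= (n - k) / total
                lr *= (1.0 - p_fail) / (1.0 - q_fail)
                k -= 1
        if k == n:  # system failure occurred before returning to all-up
            sum_gamma += lr
            sum_deriv += lr * score
    return sum_gamma / n_paths, sum_deriv / n_paths


if __name__ == "__main__":
    print("exact gamma:", exact_gamma(0.5, 1.0, 3))
    print("naive      :", simulate(0.5, 1.0, 3, 100_000))
    print("biased     :", simulate(0.5, 1.0, 3, 100_000, rho=0.5))
```

The derivative estimator multiplies the failure indicator by the accumulated score, which is the standard likelihood ratio (score function) form. Under biasing, the same quantity is additionally weighted by the likelihood ratio `lr`, so both estimators remain unbiased for the original model.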