Availability analysis is indispensable for evaluating the dependability of safety- and business-critical systems, and fault tree analysis (FTA) has proven very useful for this purpose throughout research and industry. Fault trees (FTs) can be analyzed by means of a rich set of mathematical models. One such model is the Bayesian network (BN), which has gained considerable popularity recently due to its powerful inference abilities. However, large-scale systems, as found in modern data centers for cloud computing, pose modeling challenges that require scalable availability models. The BN equivalent of an FT has no scalable representation for the k-out-of-n (k/n) voting gate, because the conditional probability table that constitutes the k/n voting gate grows exponentially in n. Thus, memory becomes the limiting factor. We propose a scalable k/n voting gate representation for BNs, based on the temporal noisy adder. The resulting model reduces the memory growth from exponential to polynomial without requiring a custom inference algorithm. Previous BN implementations of the k/n voting gate could handle only around 30 input events before memory limits made inference infeasible. In contrast, our evaluation shows that our scalable model can handle more than 700 input events per gate, making it possible to evaluate large-scale systems.
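
To make the memory argument concrete, the following Python sketch counts conditional probability table (CPT) entries for two encodings of a k/n gate with binary inputs. The first is the monolithic CPT described above. The second is a generic chain of saturating counter nodes, which is only an illustrative assumption about how an adder-style decomposition might be laid out; it is not the temporal noisy adder construction itself, and the function names are hypothetical.

```python
def monolithic_cpt_entries(n: int) -> int:
    """CPT entries for one binary gate node with n binary parents."""
    # Two output states for each of the 2**n parent configurations.
    return 2 * 2 ** n


def chained_counter_entries(n: int, k: int) -> int:
    """Total CPT entries for a chain of n counter nodes (illustrative assumption).

    Counter i has the previous counter and input i as parents and tracks how
    many of the first i inputs have failed, saturating at k. The gate fires
    when the final counter reaches k.
    """
    total = 0
    prev_states = 1  # before any input, the count can only be 0
    for i in range(1, n + 1):
        states = min(i, k) + 1  # possible counts 0..min(i, k)
        # Parent configurations (previous counter x binary input) times own states.
        total += prev_states * 2 * states
        prev_states = states
    return total


if __name__ == "__main__":
    for n in (10, 30, 700):
        k = n // 2
        print(n, k, monolithic_cpt_entries(n), chained_counter_entries(n, k))
```

Under these assumptions the chained encoding is bounded by 2n(k+1)^2 entries, so it grows polynomially in n, whereas the monolithic table doubles with every additional input event.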