To improve system reliability while conserving limited system resources, cold standby sparing is often used. In computing tasks, because active components fail randomly and a standby component must pick up the mission task whenever required, scheduled backups are often implemented to save the completed portions of the task. The backups facilitate effective system recovery: the standby component can take over the mission task from the last backup point instead of restarting it from the very beginning. This paper considers a $k$-out-of-$n$ cold standby system subject to scheduled backups, where $k$ components are online and operating and the remaining components wait in the unpowered, cold standby mode. Whenever an online component fails, a cold standby component is activated to take over the mission task from the last backup point. The backup intervals are deterministic, but can be even or uneven. Because a component may fail during an imperfect switch from the standby state to the fully powered-up state, switching failure is also considered in the system model. A multi-valued decision diagram (MDD)-based analytical approach is proposed to evaluate the reliability of the considered system, and its complexity is analyzed. The proposed method is applicable to systems with non-identical components following arbitrary lifetime distributions. Examples are given to illustrate the MDD-based method. The correctness and efficiency of the proposed method are verified using Monte Carlo simulations.