Abstract

This paper presents an approach to system reliability modeling where failures and errors are not statistically independent. The repetition of failures and errors until their causes are removed is affected by the system processes and degrades system reliability. Four types of failures are introduced: hardware transients, software and hardware design errors, and program faults. Probability of failure, mean time to failure, and system reliability depend on the type of failure. Actual measurements show that the most critical factor for system reliability is the time after occurrence of a failure when this failure can be repeated in every process that accesses a failed component. An example involving measurements collected in an IBM 4331 installation validates the model and shows its applications. The degradation of system reliability can be appreciable even for very short periods of time. This is why the conditional probability of repetition of failures is introduced. The reliability model allows prediction of system reliability based on the calculation of the mean time to failure. The comparison with the measurement results shows that the model with process dependent repetition of failures approximates system reliability with better accuracy than the model with the assumption of independent failures.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.