Abstract
Self-checking components are widely used to realize highly reliable systems. Designing a computer board with self-checking capability requires multiple mechanisms for detecting faults and errors in order to achieve short error detection latency and high error detection coverage. The literature proposes many such mechanisms and describes their effectiveness. However, without a good strategy, combining them will very likely result in a system in which some types of error are detected by several mechanisms simultaneously, whereas other types remain undetected. Moreover, the permissible number of mechanisms is limited by the cost and complexity of the system. Consequently, the choice of mechanisms is a crucial decision that has to be made with special care. This paper summarizes the main guidelines for such a choice. Special attention is given to hardware mechanisms and their importance within a global concept for fault tolerance. A current research project is introduced that aims to elaborate a policy for the efficient combination of checking mechanisms. This is to be accomplished through a theoretical and experimental evaluation of characteristic examples.