Abstract
Electronic systems are prone to failures, whether during manufacture or throughout their in-service lifetime. A number of design and fabrication techniques are presently employed that maintain an economical production yield. However, the cost of through-life maintenance and fault mitigation operations for complex, high-value systems remains a major challenge and requires new design methods in order to increase their resilience. In this article, the focus is on applications that are sensitive to transient random errors caused by single-event upsets and multiple-bit upsets occurring within their electronic systems and sub-systems, as well as applications that benefit from fault detection and localisation. A novel self-restoration strategy is proposed based on a two-layer design approach comprising a fault-tolerant coordination layer with convergent cellular automata and a configurable functional logic layer. This design strategy is able to self-reconstruct the correct functional logic configuration in the event of transient faults without external intervention. The necessary convergent cellular automata rule set and state table sizes for 3 × 3 and 4 × 4 binary coded patterns are analysed in order to estimate the generic resource requirements for larger designs. Additionally, the possibility of exploiting the design for built-in fault detection and diagnostic reporting is investigated.
Highlights
The maintenance and repair of high-value systems is costly and in many cases requires significant investment at the design phase in order to limit the cost of through-life support
This paper presents an alternative design strategy that is based upon a self-recovering algorithm able to protect data patterns and logic configurations from SEU and MBU by continually refreshing the correct pattern at a fine-grained level
Previous designs have focused on feed-forward circuits, whose interconnect structure ensures that data paths do not cross and that data feeds in from left to right or top to bottom, so that the data flow between logic blocks mimics that of the convergent cellular automata (CCA) itself
Summary
The maintenance and repair of high-value systems is costly and in many cases requires significant investment at the design phase in order to limit the cost of through-life support. While there is a continuing desire within the electronic systems domain towards the use of Commercial Off-the-Shelf (COTS) electronic components, even for mission critical systems such as space and avionics, such components are expected to fail more frequently in future due to the increasing influence of transient random single error events (SEEs) [1]. This is especially true of integrated circuits (ICs) subjected to high energy radiation particles such as neutrons, where the conventional solution has been to adopt expensive radiation-hardened ICs. New design strategies for self-diagnosis and self-recovery in engineering systems will open up new opportunities for reducing the overall through-life cost of complex systems and have become an area of considerable interest in recent years [2]. The proposed strategy is further able to recover in the extreme case of an upset occurring simultaneously within every bit of the configuration pattern, provided permanent damage does not occur.
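The recovery behaviour described above can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the article's rule set: it assumes each cell derives its state only from its west and north neighbours (matching the left-to-right, top-to-bottom data flow mentioned in the highlights), that cells beyond the top and left edges present a fixed boundary state, and that every cell is given its own unique state, which is the trivial upper bound on state table size rather than the minimised tables the article analyses. All names (`BOUNDARY`, `build_rules`, `step`, `restore`, `decode`) are hypothetical.

```python
BOUNDARY = 0  # assumed fixed state seen beyond the top and left edges


def build_rules(pattern):
    """Derive (west, north) -> state rules for a square binary pattern."""
    n = len(pattern)
    # Assumption: one unique state per cell (id = row * n + col + 1),
    # so every (west, north) pair maps to exactly one state.
    target = [[r * n + c + 1 for c in range(n)] for r in range(n)]
    rules, bit_of = {}, {}
    for r in range(n):
        for c in range(n):
            west = target[r][c - 1] if c > 0 else BOUNDARY
            north = target[r - 1][c] if r > 0 else BOUNDARY
            rules[(west, north)] = target[r][c]
            bit_of[target[r][c]] = pattern[r][c]
    return rules, bit_of, target


def step(grid, rules):
    """One synchronous update: every cell re-reads its west/north neighbours."""
    n = len(grid)
    new = [[BOUNDARY] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            west = grid[r][c - 1] if c > 0 else BOUNDARY
            north = grid[r - 1][c] if r > 0 else BOUNDARY
            # Neighbour pairs with no rule (corrupted states) fall back to BOUNDARY.
            new[r][c] = rules.get((west, north), BOUNDARY)
    return new


def restore(grid, rules):
    """Run 2n - 1 updates, enough for the correct states to reach the far corner."""
    for _ in range(2 * len(grid) - 1):
        grid = step(grid, rules)
    return grid


def decode(grid, bit_of):
    """Map cell states back to the binary pattern they encode."""
    return [[bit_of.get(state, 0) for state in row] for row in grid]


# Usage: corrupt every cell of a 3 x 3 pattern, then let the rules refresh it.
pattern = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
rules, bit_of, target = build_rules(pattern)
upset = [[99] * 3 for _ in range(3)]  # every cell holds an invalid state
restored = restore(upset, rules)
```

In this sketch the top-left cell is always regenerated from the boundary alone, and each later update pushes correct states one anti-diagonal further, so the whole pattern is rebuilt even when every cell was upset. Cells may briefly hold wrong states during convergence; the sketch says nothing about the fault tolerance of the rule logic itself or about permanent damage.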