Abstract

Electronic systems are prone to failures, whether during manufacture or throughout their in-service lifetime. A number of design and fabrication techniques are presently employed that maintain an economical production yield. However, the cost of through-life maintenance and fault mitigation operations for complex, high-value systems remains a major challenge and requires new design methods in order to increase their resilience. In this article, the focus is on applications that are sensitive to transient random errors caused by single-event upsets and multiple-bit upsets occurring within their electronic systems and sub-systems, as well as applications that benefit from fault detection and localisation. A novel self-restoration strategy is proposed based on a two-layer design approach comprising a fault-tolerant coordination layer with convergent cellular automata and a configurable functional logic layer. This design strategy is able to self-reconstruct the correct functional logic configuration in the event of transient faults without external intervention. The necessary convergent cellular automata rule set and state table sizes for 3 × 3 and 4 × 4 binary coded patterns are analysed in order to estimate the generic resource requirements for larger designs. Additionally, the possibility of exploiting the design for built-in fault detection and diagnostic reporting is investigated.

Highlights

  • The maintenance and repair of high-value systems is costly and in many cases requires significant investment at the design phase in order to limit the cost of through-life support

  • This paper presents an alternative design strategy that is based upon a self-recovering algorithm able to protect data patterns and logic configurations from single-event upsets (SEUs) and multiple-bit upsets (MBUs) by continually refreshing the correct pattern at a fine-grained level

  • Previous designs have focused on feed-forward circuits, whose interconnect structure ensures that data paths do not cross and that data feed in from left to right or top to bottom, so that the data flow between logic blocks mimics that of the convergent cellular automata (CCA) itself

Summary

Introduction

The maintenance and repair of high-value systems is costly and in many cases requires significant investment at the design phase in order to limit the cost of through-life support. While there is a continuing desire within the electronic systems domain towards the use of Commercial Off-the-Shelf (COTS) electronic components, even for mission-critical systems such as space and avionics, such components are expected to fail more frequently in future due to the increasing influence of transient random single error events (SEEs) [1]. This is especially true of integrated circuits (ICs) subjected to high-energy radiation particles such as neutrons, where the conventional solution has been to adopt expensive radiation-hardened ICs. New design strategies for self-diagnosis and self-recovery in engineering systems will open up new opportunities for reducing the overall through-life cost of complex systems and have become an area of considerable interest in recent years [2]. The strategy proposed here is further able to recover in the extreme case of an upset occurring simultaneously within every bit of the configuration pattern, provided permanent damage does not occur.
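The recovery property described above can be illustrated with a minimal sketch. It is not the rule set analysed in the paper: it assumes a feed-forward automaton in which each cell reads only its west and north neighbours, a fixed boundary state on the left and top edges, and a deliberately naive encoding that gives every cell its own unique state so that rules never conflict. The names `BOUNDARY`, `build_rules`, `step`, `decode` and `recover` are hypothetical. Under these assumptions, a grid corrupted in every cell converges back to the target pattern within rows + columns - 1 synchronous steps, because correct values propagate inwards from the fixed boundary.

```python
import random

BOUNDARY = 0  # hypothetical fixed state fed in from the left and top edges


def build_rules(pattern):
    """Derive a (west, north) -> state rule table from a target bit pattern."""
    rows, cols = len(pattern), len(pattern[0])
    # Naive assumption: each cell gets a unique state, so no two rules conflict.
    target = [[1 + r * cols + c for c in range(cols)] for r in range(rows)]
    rules = {}
    for r in range(rows):
        for c in range(cols):
            west = target[r][c - 1] if c > 0 else BOUNDARY
            north = target[r - 1][c] if r > 0 else BOUNDARY
            rules[(west, north)] = target[r][c]
    # Map each cell state back to the configuration bit it represents.
    bit_of = {target[r][c]: pattern[r][c]
              for r in range(rows) for c in range(cols)}
    return rules, bit_of


def step(grid, rules):
    """One synchronous update using only west and north neighbours."""
    rows, cols = len(grid), len(grid[0])
    new = [[BOUNDARY] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            west = grid[r][c - 1] if c > 0 else BOUNDARY
            north = grid[r - 1][c] if r > 0 else BOUNDARY
            # Unknown neighbourhoods fall back to an arbitrary state
            # (a choice made for this sketch only).
            new[r][c] = rules.get((west, north), BOUNDARY)
    return new


def decode(grid, bit_of):
    """Translate cell states into configuration bits (unknown states read as 0)."""
    return [[bit_of.get(state, 0) for state in row] for row in grid]


def recover(grid, rules, max_steps):
    """Apply the update rule a fixed number of times."""
    for _ in range(max_steps):
        grid = step(grid, rules)
    return grid


pattern = [[1, 0, 1],
           [0, 1, 1],
           [1, 1, 0]]
rules, bit_of = build_rules(pattern)

# Upset every cell at once, then let the automaton run.
rng = random.Random(0)
corrupted = [[rng.randrange(0, 10) for _ in row] for row in pattern]
restored = recover(corrupted, rules, max_steps=len(pattern) + len(pattern[0]) - 1)
assert decode(restored, bit_of) == pattern
```

Giving every cell a unique state makes the rule table grow with the number of cells, which is the resource cost that a more economical rule set would aim to reduce.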

Design framework
Properties of the functional logic layer
Properties of the fault-tolerant coordination layer
CCA design strategy
Mitigation strategy for rule conflicts
LUT resource requirements
Analysis of 4 × 4 patterns
Conditions for self-configuration and recovery
Example design
Fault detection
Design and manufacturing considerations
Summary and conclusions