Abstract

Distance-Ranked Fault Identification (DRFI) is a dynamic reconfiguration technique that employs runtime inputs to conduct online functional testing of fielded FPGA logic and interconnect resources without test vectors. At design time, a diverse set of functionally identical bitstream configurations is created, each utilizing alternate hardware resources in the FPGA fabric. An ordering, updated by the PageRank indexing precedence, is imposed on the configuration pool. Configurations that utilize permanently damaged resources, and hence manifest discrepant outputs, receive a lower rank and are thus less preferred for instantiation on the FPGA. Results indicate accurate identification of fault-free configurations in a pool of pregenerated bitstreams with a low number of reconfigurations and input evaluations. For MCNC benchmark circuits, the observed reduction in input evaluations is up to 75% when comparing the DRFI technique to unguided evaluation. The DRFI diagnosis method is seen to isolate all 14 healthy configurations from a pool of 100 pregenerated configurations, thereby offering 100% isolation accuracy provided that fault-free configurations exist in the design pool. When complete recovery is not feasible, graceful degradation may be realized, as demonstrated by the PSNR improvement of images processed in a video encoder case study.

Highlights

  • The self-reconfiguration capability of FPGAs has been identified as a useful feature for realizing designs that are resilient to local permanent faults and for mitigating transistor aging degradation [1].

  • While the Processing Element (PE) design considered in the DCT core consumes fewer resources than the capacity of a Partial Reconfiguration Region (PRR), we choose the PRR size to be the minimum permitted by the vendor’s tool and the FPGA device under consideration.

  • The experiments indicate that the approach identifies the correct configuration in a fraction of the comparisons needed by unguided search, thereby offering considerably improved throughput.


Summary

Introduction

The self-reconfiguration capability of FPGAs has been identified as a useful feature for realizing designs that are resilient to local permanent faults and for mitigating transistor aging degradation [1]. Recovery from local permanent damage in FPGA-based designs can be realized by reconfiguring the device at runtime to utilize fault-free logic resources. Given some faulty resources in a particular region of an FPGA chip, the lost functionality can be restored by utilizing a pristine area of the chip. If a circuit realized by a particular bitstream manifests an observable fault, an alternate bitstream utilizing only fault-free resources can be downloaded into a reconfigurable region. A Concurrent Error Detection (CED) scheme [2] is a well-established, low-latency, spatial-redundancy approach to fault detection. Such circuits are instantiated with a single replicated module to realize a Duplex Modular Redundancy (DMR) arrangement. If autonomous recovery is desired after fault detection, an efficient fault recovery technique is needed, which is the subject of this paper.
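The DMR detection step described above can be illustrated with a minimal software sketch. It is not the authors' implementation: the toy circuit `healthy_adder`, the fault model in `stuck_at_zero`, and the helper `dmr_check` are hypothetical names introduced only for illustration, and each hardware module is modeled as a plain function of an 8-bit input word.

```python
from typing import Callable, Iterable, List, Tuple

# Assumption: a circuit instance is modeled as a function from an input word to an output word.
Module = Callable[[int], int]


def healthy_adder(x: int) -> int:
    """Toy fault-free circuit (hypothetical): add 3 and keep the low 8 bits."""
    return (x + 3) & 0xFF


def stuck_at_zero(module: Module, bit: int) -> Module:
    """Wrap a module so that one output bit is permanently stuck at 0."""
    mask = ~(1 << bit) & 0xFF
    return lambda x: module(x) & mask


def dmr_check(a: Module, b: Module, inputs: Iterable[int]) -> List[Tuple[int, int, int]]:
    """Return (input, out_a, out_b) for every input on which the two replicas disagree."""
    return [(x, a(x), b(x)) for x in inputs if a(x) != b(x)]


# One replica uses a damaged resource that forces output bit 0 to zero.
faulty = stuck_at_zero(healthy_adder, bit=0)
mismatches = dmr_check(healthy_adder, faulty, range(4))
print(mismatches)
```

In this sketch the fault is visible only on inputs whose correct output has bit 0 set, which reflects why a fault must be observable on the runtime inputs before it can be detected. A discrepancy alone also does not reveal which of the two replicas is faulty, so a separate recovery step is still required after detection.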

