Abstract

The rate of transient faults has increased significantly as the technology scales up. The tolerance of transient faults has become an important issue in the system design. Dual modular redundancy (DMR) and triple modular redundancy (TMR) are two commonly used techniques that can achieve fault detection and masking through executing redundant tasks. As DMR and TMR have different time and cost overheads, we must carefully determine which one should be used for each task (i.e., task hardening) to achieve the optimal system design. Furthermore, for multi-core systems, the system-level design includes the allocation of cores for the tasks (i.e., task mapping) as well. This paper aims at task hardening and mapping simultaneously for independent tasks on multi-cores with heterogeneous performances, in order to minimize the maximum completion time of all tasks (i.e., makespan). We demonstrate that once task hardening is given, task mapping of independent tasks can be achieved by employing min–max-weight perfect matching with a polynomial time complexity. Besides, as there is a trade-off between cost and time performance, we propose a multi-objective memetic algorithm (MOMA)-based task hardening method to obtain a set of solutions with different numbers of cores (i.e., costs), so the designer can choose different solutions according to different requirements. The key idea of the MOMA is to incorporate problem-specific knowledge into the global search of evolutionary algorithms. Our experimental studies have demonstrated the effectiveness of the proposed method and have shown that by combining the results of MOMA and MOEA we can provide a designer with a highly accurate set of solutions within a reasonable amount of time.

Highlights

  • The rapid technology advances, such as transistor size scaling, high operation frequency, and low voltage supply, have led to multiple reliability threats to circuits and systems (Constantinescu 2003)

  • We demonstrate that once task hardening is given, task mapping of independent tasks can be achieved by employing min–max-weight perfect matching with a polynomial time complexity

  • As there is a trade-off between cost and time performance, we propose a multi-objective memetic algorithm (MOMA)-based task hardening method to obtain a set of solutions with different numbers of cores, so the designer can choose different solutions according to different requirements

Read more

Summary

Introduction

The rapid technology advances, such as transistor size scaling, high operation frequency, and low voltage supply, have led to multiple reliability threats to circuits and systems (Constantinescu 2003). As the technology keeps scaling down, more and more cores can be integrated into a single chip to meet the growing computing demand of modern applications. Such applications depend on high-performance computing (HPC), and a high reliability in the presence of transient faults is required as well. The individual cores may exhibit significant frequency variations (e.g., up to 30%) because of process variations (result from fabrication imprecision) (Bowman et al 2002; Dighe et al 2011) This situation is worsening with technology scaling because of the difficulty of precise fabrication with reduced dimensions at a nanoscale

Objectives
Findings
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call