Transient faults are rising as a crucial concern in the reliability of computer systems. As the emerging trend of integrating coprocessors into symmetric multiprocessors, it offers a better choice for software-oriented fault tolerance approaches. This paper presents coprocessor-based process level redundancy (PLR) which makes use of coprocessors and frees CPU cycle to other tasks. The experiment is conducted by comparing the performance of one CPU version of PLR and one coprocessor version PLR using a subset of optimised SPEC CPU2006 benchmark. It shows that the proposed approach enhances performance by 32.6% on average. The performance can be enhanced more if one application contains more system calls. This common technique can be adapted to other software-based fault tolerance as well.
Read full abstract