Abstract

Static off-line scheduling ensures predictability of worst-case behavior and high resource utilization for safety-critical applications but lacks the flexibility needed to deal with run-time fault-tolerance. We present a temporal redundancy-based recovery technique that tolerates transient task failures in statically scheduled distributed embedded systems where tasks have timing, resource, and precedence constraints. Task failures are handled using precomputed contingency schedules that introduce adaptive fault tolerance into table-driven dispatchers. Failures are masked using the spare capacity on the affected processor and the recovery scheme requires no hardware overhead. Our approach combines the benefits of static scheduling with the run-time flexibility needed for fault tolerance in low-cost embedded systems. We present a method to obtain contingency schedules and prove its correctness. We also evaluate the effectiveness of the proposed method through simulation.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call