Abstract

This paper presents the Quick Error Detection (QED) technique for systematically creating families of post-silicon validation tests that quickly detect bugs inside processor cores and uncore components (cache controllers, memory controllers, and on-chip interconnection networks) of multicore systems-on-chip (SoCs). Such quick detection is essential because long error detection latency, the time elapsed between the occurrence of an error due to a bug and its manifestation as an observable failure, severely limits the effectiveness of traditional post-silicon validation approaches. QED can be implemented completely in software, without any hardware modification, so it is readily applicable to existing designs. Results from multiple hardware platforms, including the Intel® Core™ i7 SoC and a state-of-the-art commercial multicore SoC, together with simulation results from an OpenSPARC T2-like multicore SoC using bug scenarios from commercial multicore SoCs, demonstrate that: 1) error detection latencies of post-silicon validation tests can be very long, up to billions of clock cycles, especially for bugs inside uncore components; 2) QED shortens error detection latencies by up to nine orders of magnitude, to only a few hundred cycles for most bug scenarios; and 3) QED enables up to a fourfold increase in bug coverage.

