Evaluating and Mitigating Neutrons Effects on COTS EdgeAI Accelerators

Sebastian Blower, Maria Kastriotou, Christopher D Frost, Carlo Cazzaniga, Paolo Rech

doi:10.1109/tns.2021.3086686

Abstract

EdgeAI is an emerging artificial intelligence (AI) accelerator technology, which is capable of delivering improved AI performance at both a lower cost and a lower power level. With the aim of implementation in large quantities and in safety-critical environments, it is imperative to understand how single-event effects (SEEs) affect the reliability of this new family of devices and to propose efficient hardening solutions. Through neutron beam experiments and fault-injection analysis of a commercial-off-the-shelf (COTS) EdgeAI device, we are able to identify the device's SEE failure-modes, separate the error rate contributions of the device's different resources, and characterize the device's SEE reliability. During this analysis, we discovered that the vast majority of single-bit flips have no appreciable effect on the output. After this analysis, we propose a hardening solution that implements triple-modular redundancy (TMR) in the device without changing its physical architecture. We experimentally validate this solution and show that we are able to correct 96% of the misclassifications (critical errors) with nearly zero overhead.
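
The abstract does not say how the TMR scheme is realized on the device, so the following is only a generic sketch of the voting idea behind triple-modular redundancy applied to a classifier: the same input is classified three times and the label agreed on by a majority is kept. The names `majority_vote`, `tmr_classify`, and the `infer` callable are hypothetical and do not come from the paper or from any accelerator API.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def majority_vote(labels: Sequence[int]) -> Optional[int]:
    """Return the label a strict majority of replicas agree on, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) // 2 else None


def tmr_classify(infer: Callable[[object], int], sample: object) -> Optional[int]:
    """Run a (hypothetical) inference callable three times and vote on the labels."""
    return majority_vote([infer(sample) for _ in range(3)])


if __name__ == "__main__":
    # Simulated replicas: the second run returns a corrupted label.
    outputs = iter([7, 2, 7])
    print(tmr_classify(lambda _sample: next(outputs), sample=None))  # prints 7
```

A single corrupted replica is outvoted by the other two. When all three labels disagree the sketch returns `None`, so the caller can treat the result as a detected but uncorrected error.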


Journal: IEEE Transactions on Nuclear Science
Publication Date: Aug 1, 2021
Citations: 14

Similar Papers

Single Event Effect Mitigation in ReConfigurable Computers for Space Applications
P.L Murray ... D Vanburen
01 Jan 2004

Compiler extensions towards reliable multicore processors
Y Nezzari ... C P Bridges
01 Mar 2017

DMT and DT2: Two Fault-Tolerant Architectures developed by CNES for COTs-based Spacecraft Supercomputers
M.P Pignol
10 Jul 2006

New Protection Techniques Against SEUs for Moving Average Filters in a Radiation Environment
P Reyes ... P Reviriego
IEEE Transactions on Nuclear Science | VOL. 54
01 Aug 2007
