As convolutional neural network (CNN) accelerators are adopted in emerging safety-critical areas, their reliability becomes a prominent concern. The systolic array is widely used as the main processing structure in CNN accelerators, so its reliability characteristics need to be explored comprehensively. In this work, we develop a microarchitecture-level fault injection framework, saca-FI, to analyze the reliability of systolic-array-based CNN accelerators. It supports the investigation of various vulnerability sources (e.g., single-bit flips, multi-bit flips, stuck-at faults, multiple faults) at different levels (e.g., model layers and storage locations). Using saca-FI, we evaluate the resilience of several commonly used networks and observe several important architecture-level reliability characteristics. We then propose two opportunistic protection techniques to demonstrate the use of saca-FI in designing energy-efficient error protection mechanisms.