The advent of in-memory computing has introduced a new paradigm of computation, which offers significant improvements in latency and power consumption for emerging embedded AI accelerators. Nevertheless, the effects of hardware variations and non-idealities in emerging memory technologies may significantly compromise the accuracy of inferred neural networks and result in malfunctions in safety-critical applications. This article addresses the issue from three different perspectives. First, we describe the technology-related sources of these variations. Then, we propose an architectural-level mitigation strategy that involves the coordinated action of two checksum codes designed to detect and correct errors at runtime. Finally, we optimize the area and latency overhead of the proposed solution using two accuracy-aware hardware-software co-design techniques. The results demonstrate higher efficiency in mitigating the accuracy degradation of multiple AI algorithms across different technologies compared with state-of-the-art solutions and traditional techniques such as triple modular redundancy. Several configurations of our implementation recover more than 95% of the original accuracy with less than 40% area overhead and less than 30% latency overhead. This article is part of the themed issue 'Emerging technologies for future secure computing platforms'.
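The coordinated action of two checksum codes described above resembles classical algorithm-based fault tolerance (ABFT) for matrix-vector products, the core operation of in-memory AI accelerators. The sketch below is illustrative only, assuming integer arithmetic and at most one faulty output element; the function and variable names are hypothetical and not taken from the article.

```python
import numpy as np

def protect_mvm(W, x, faulty_y=None):
    """Compute y = W @ x and use two checksum rows to detect and
    correct a single erroneous output element (ABFT-style sketch)."""
    m, _ = W.shape
    idx = np.arange(1, m + 1)      # weights 1..m for the locating checksum
    c1 = W.sum(axis=0)             # plain checksum: column sums of W
    c2 = idx @ W                   # weighted checksum row
    y = W @ x if faulty_y is None else faulty_y.copy()
    d1 = c1 @ x - y.sum()          # nonzero syndrome => some element is wrong
    d2 = c2 @ x - idx @ y          # weighted syndrome encodes the position
    if d1 != 0:                    # single-error correction
        p = int(round(d2 / d1)) - 1  # locate the faulty element
        y[p] += d1                 # d1 equals minus the injected error
    return y
```

With an error of magnitude e injected at position p, the first syndrome gives d1 = -e (detection) and the second gives d2 = -(p+1)e, so d2/d1 locates the fault and adding d1 restores the correct value; this is one simple way two checksums can divide the detect and correct roles between them.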