Abstract
In-memory computing (IMC) processors show significant energy and area efficiency for deep neural network (DNN) processing [1–3]. As shown in Fig. 16.5.1, despite promising macro-level efficiency and throughput, three main challenges remain in extending these gains to system-level performance at a high integration level. First, most previous works used IMC macros with a fixed configuration and fixed size: when the macro is smaller than a DNN layer's dimensions, repetitive memory accesses are required for input/output activations (IA/OA), consuming >40% of IMC power, while in the opposite case the macro is underutilized. Second, previous eDRAM-based IMCs [4–6] showed even lower cell density than SRAM-based IMCs [1–3], owing to the area needed to realize a large cell capacitor for a long retention time. Third, previous IMC processors [1], [2], [5] employed bit truncation to mitigate the ADC area overhead, incurring large quantization noise. Consequently, an IMC processor that can adapt and optimize its macro architecture to varying tasks, built on a high-density cell, is required for higher system-level efficiency and throughput in DNN processing.
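To make the first challenge concrete, the sketch below (not from the paper; all dimensions and the helper name are illustrative assumptions) estimates how tiling a layer onto a fixed-size macro multiplies IA/OA memory traffic, which is the mechanism behind the >40% power share cited above.

```python
# A minimal sketch (illustrative, not the paper's model) of how often IA/OA
# must be re-accessed when a DNN layer is tiled onto a fixed-size IMC macro.
import math

def tile_refetch_counts(layer_rows, layer_cols, macro_rows, macro_cols):
    """Return (ia_fetches, oa_accesses) for a layer_rows x layer_cols weight
    matrix mapped onto a macro_rows x macro_cols IMC macro.

    Each column tile forces the same input activations (IA) to be re-read, and
    each row tile forces partial output activations (OA) to be accumulated
    again, so both counts grow with the tiling factors.
    """
    row_tiles = math.ceil(layer_rows / macro_rows)   # splits along the input dimension
    col_tiles = math.ceil(layer_cols / macro_cols)   # splits along the output dimension
    ia_fetches = layer_rows * col_tiles              # IA reloaded once per column tile
    oa_accesses = layer_cols * row_tiles             # OA partial sums touched per row tile
    return ia_fetches, oa_accesses

# Example: a 1024x1024 fully connected layer on a 256x256 macro re-reads each
# IA element 4x and touches each OA partial sum 4x, versus once each when the
# macro matches the layer dimensions.
print(tile_refetch_counts(1024, 1024, 256, 256))     # -> (4096, 4096)
```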