Abstract
We demonstrate the design of efficient and high-performance artificial intelligence (AI)/deep learning accelerators with customized spin transfer torque (STT)-MRAM (STT-MRAM) and a reconfigurable core. Based on model-driven detailed design space exploration, we present the design methodology of an innovative scratchpad-assisted on-chip STT-MRAM-based buffer system for high-performance accelerators. Using analytically derived expression of memory occupancy time of AI model weights and activation maps, the volatility of STT-MRAM is adjusted with process and temperature variation aware scaling of thermal stability factor to optimize the retention time, energy, read/write latency, and area of STT-MRAM. From the analysis of AI workloads and accelerator implementation in 14-nm technology, we verify the efficacy of our AI accelerator with STT-MRAM (STT-AI). Compared to an SRAM-based implementation, the STT-AI accelerator achieves 75% area and 3% power savings at isoaccuracy. Furthermore, with a relaxed bit error rate and negligible AI accuracy tradeoff, the designed STT-AI Ultra accelerator achieves 75.4% and 3.5% savings in area and power, respectively, over regular SRAM-based accelerators.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.