Abstract

Modern computing applications based upon machine learning can incur significant data movement overheads in state-of-the-art computers. Resistive-memory-based processing-using-memory (PUM) can mitigate this data movement by instead performing computation in situ (i.e., directly within memory cells), but device-level limitations restrict the practicality and/or performance of many PUM architecture proposals. The RACER architecture overcomes these limitations by proposing efficient peripheral circuitry and the concept of bit-pipelining to enable high-performance, high-efficiency computation using small memory tiles. In this work, we extend RACER to adapt easily to different PUM logic families by (1) modifying the device access circuitry to support a wide range of logic families, (2) evaluating three logic families proposed by prior work, and (3) proposing and evaluating a new logic family called OSCAR that significantly relaxes the switching voltage constraints required to perform logic with resistive memory devices. We show that the modified RACER architecture, using the OSCAR logic family, can enable practical PUM on real ReRAM devices while improving performance and energy savings by 30% and 37%, respectively, over the original RACER work.

