Abstract
Although ReRAM has been highly successful in reducing the energy consumption of various neural networks, it still suffers from write amplification in energy, which prevents it from providing efficient storage for the ubiquitous streaming data in CNNs, such as feature maps. Racetrack memory, an emerging magnetic memory technology, is a suitable candidate for holding streaming data because it offers fast sequential access with ultra-low read and write energy. In this work, we propose a hybrid processing-in-memory architecture, called MemUnison, that coordinates ReRAM and racetrack memory to overcome the expensive storage of streaming data in ReRAM. Feature maps are placed in racetrack memory and weights are left in ReRAM, and a datapath is constructed between the two sides to form a fetch-process-writeback pipeline. Because the invalid shifts of racetrack memory incur a large number of pipeline bubbles, we propose a row-based access scheme that can read and write a feature map without any invalid shifts. For this row-based operation, a cohesive control method is proposed to coordinate racetrack memory and ReRAM. At runtime, convolution kernels are scheduled in ReRAM banks for cross-channel calculations of one row, by which the computing complexity of a convolutional layer can be reduced by 4 orders of magnitude, exceeding the 2 orders of magnitude achieved by traditional ReRAM.
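To give some intuition for why access order matters on a racetrack, the following is a minimal toy sketch, not the MemUnison design. It assumes a single track with one access port, a feature map stored in row-major order, and it counts a shift as "invalid" when it moves the track without delivering a needed element (any shift beyond the first for a given access). The function names `count_shifts`, `row_order`, and `window_order` are hypothetical and chosen only for illustration.

```python
def count_shifts(positions, start=0):
    """Return (total_shifts, invalid_shifts) for visiting track positions in order.

    Toy assumption: moving the port from one position to another costs
    |distance| shifts, and every shift beyond the first is 'invalid'.
    """
    head = start
    total = 0
    invalid = 0
    for p in positions:
        dist = abs(p - head)
        total += dist
        invalid += max(0, dist - 1)
        head = p
    return total, invalid


def row_order(height, width):
    """Row-based access: stream the feature map one row after another."""
    return list(range(height * width))


def window_order(height, width, k):
    """Window-based access: touch every element of each k x k sliding window."""
    order = []
    for r in range(height - k + 1):
        for c in range(width - k + 1):
            for dr in range(k):
                for dc in range(k):
                    order.append((r + dr) * width + (c + dc))
    return order


if __name__ == "__main__":
    h, w, k = 4, 4, 3
    print("row-based   :", count_shifts(row_order(h, w)))
    print("window-based:", count_shifts(window_order(h, w, k)))
```

Under these assumptions, streaming the map row by row needs exactly one shift per element after the first and produces no invalid shifts, whereas visiting elements window by window forces the track to jump between rows and accumulates invalid shifts.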