Deep neural networks (DNNs) have gained tremendous popularity in recent years due to their ability to achieve superhuman accuracy in a wide variety of machine learning tasks. However, the compute and memory requirements of DNNs have grown rapidly, creating a need for energy-efficient hardware. Resistive crossbars have attracted significant interest in the design of the next generation of DNN accelerators due to their ability to natively execute massively parallel vector-matrix multiplications within dense memory arrays. However, crossbar-based computations face a major challenge due to device and circuit-level nonidealities, which manifest as errors in the vector-matrix multiplications and eventually degrade DNN accuracy. To address this challenge, there is a need for tools that can model the functional impact of nonidealities on DNN training and inference. Existing efforts toward this goal are either limited to inference or are too slow to be used for large-scale DNN training. We propose TxSim, a fast and customizable modeling framework to functionally evaluate DNN training on crossbar-based hardware considering the impact of nonidealities. The key features of TxSim that differentiate it from prior efforts are: 1) it comprehensively models nonidealities during all training operations (forward propagation, backward propagation, and weight update) and 2) it achieves computational efficiency by mapping crossbar evaluations to well-optimized Basic Linear Algebra Subprograms (BLAS) routines and incorporates speedup techniques to further reduce simulation time with minimal impact on accuracy. TxSim achieves a 6× to 108× improvement in simulation speed over prior works, and thereby makes it feasible to evaluate the training of large-scale DNNs on crossbars.
Our experiments using TxSim reveal that the accuracy degradation in DNN training due to nonidealities can be substantial (3%-36.4%) for large-scale DNNs and data sets, underscoring the need for further research in mitigation techniques. We also analyze the impact of various device and circuit-level parameters and the associated nonidealities to provide key insights that can guide the design of crossbar-based DNN training accelerators.
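
To make the idea of evaluating a nonideal crossbar through BLAS-backed matrix products concrete, the following is a minimal illustrative sketch and not TxSim's actual model or API. It assumes a hypothetical function `nonideal_vmm`, a fixed crossbar tile height `tile_rows`, uniform weight quantization to `bits` bits, and multiplicative Gaussian conductance variation with standard deviation `sigma`; all of these names and modeling choices are assumptions made for illustration only. Each tile is perturbed and then evaluated with a dense matrix product, which NumPy dispatches to a BLAS routine.

```python
import numpy as np


def nonideal_vmm(x, w, tile_rows=64, bits=4, sigma=0.05, rng=None):
    """Approximate x @ w as if w were stored on row-tiled crossbars.

    Illustrative assumptions (not taken from the paper):
      - weights are quantized uniformly with a step of 2*max|w| / (2**bits - 1)
      - each stored weight suffers multiplicative Gaussian variation (sigma)
      - partial sums from different tiles are accumulated exactly
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x, dtype=np.float64)
    w = np.asarray(w, dtype=np.float64)

    out = np.zeros((x.shape[0], w.shape[1]))
    w_max = np.abs(w).max()
    if w_max == 0.0:
        return out  # all-zero weights: nothing to perturb

    step = 2.0 * w_max / (2 ** bits - 1)  # uniform quantization step

    # Process the weight matrix one crossbar-sized row tile at a time.
    for r in range(0, w.shape[0], tile_rows):
        tile = w[r:r + tile_rows]
        tile_q = np.round(tile / step) * step  # limited precision
        noise = 1.0 + sigma * rng.standard_normal(tile_q.shape)
        tile_n = tile_q * noise  # device-to-device variation
        # Dense matrix product: NumPy hands this to BLAS (GEMM).
        out += x[:, r:r + tile_rows] @ tile_n
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((8, 256))
    w = rng.standard_normal((256, 32))
    ideal = x @ w
    noisy = nonideal_vmm(x, w, rng=rng)
    rel_err = np.linalg.norm(noisy - ideal) / np.linalg.norm(ideal)
    print(f"relative error of nonideal product: {rel_err:.3f}")
```

In a sketch like this, the same perturbed product could stand in for the forward pass, the backward pass (using the transposed weights), and the weight update, which is the sense in which nonidealities affect every training operation. Batching the perturbation per tile rather than per device is what keeps the cost close to that of an ordinary matrix product.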