Machine learning inference is of broad interest, increasingly in energy-constrained applications. However, platforms are often pushed to their energy limits, especially with deep learning models, which provide state-of-the-art inference performance but are also computationally intensive. This has motivated algorithmic co-design, where flexibility in the model and model parameters, derived from training, is exploited for hardware energy efficiency. This work extends a model-training algorithm referred to as Stochastic Data-Driven Hardware Resilience (S-DDHR) to enable statistical models of computations, amenable to energy/throughput-aggressive hardware operating points as well as emerging variation-prone device technologies. S-DDHR itself extends the previous approach of DDHR by incorporating the statistical distribution of hardware variations in model-parameter learning, rather than a single sample of that distribution. This is critical to developing accurate and composable abstractions of computations, enabling scalable hardware-generalized training rather than hardware instance-by-instance training. S-DDHR is demonstrated and evaluated for a bit-scalable MRAM-based in-memory computing architecture, whose energy/throughput trade-offs explicitly motivate statistical computations. Using foundry data to model MRAM device variations, S-DDHR is shown to preserve high inference performance on benchmark datasets (MNIST, CIFAR-10, SVHN) as variation parameters are scaled to high levels, exhibiting less than a 3.5% accuracy drop at 10× the nominal variation level.
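To make the distinction between training against a single sampled hardware instance (DDHR-style) and training against the statistical distribution of variations (S-DDHR-style) concrete, below is a minimal, hypothetical PyTorch sketch. It is not the paper's implementation: the `NoisyLinear` layer, the multiplicative Gaussian noise model, and the `sigma` variation level are illustrative assumptions standing in for device-accurate MRAM variation models derived from foundry data.

```python
# Hypothetical sketch, not the paper's S-DDHR code: contrasts training with a
# fixed sampled hardware instance vs. resampling variations every forward pass.
import torch
import torch.nn as nn

class NoisyLinear(nn.Linear):
    """Linear layer with multiplicative Gaussian weight perturbations
    standing in for device variations (assumed noise model)."""
    def __init__(self, in_f, out_f, sigma=0.1, resample=True):
        super().__init__(in_f, out_f)
        self.sigma = sigma        # relative variation level (assumption)
        self.resample = resample  # True: statistical training (S-DDHR-like)
        self.register_buffer("fixed_noise", torch.randn(out_f, in_f) * sigma)

    def forward(self, x):
        if self.resample:
            # Draw a fresh variation sample from the distribution each pass.
            noise = torch.randn_like(self.weight) * self.sigma
        else:
            # Reuse one fixed sample, i.e., one specific hardware instance.
            noise = self.fixed_noise
        return nn.functional.linear(x, self.weight * (1 + noise), self.bias)

# Toy usage: a small classifier trained under resampled variations.
model = nn.Sequential(NoisyLinear(784, 128, sigma=0.1), nn.ReLU(),
                      NoisyLinear(128, 10, sigma=0.1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()
```

In this sketch, setting `resample=False` corresponds to fitting the model to one hardware instance, whereas `resample=True` exposes the parameters to the full variation distribution during training, which is the property the abstract highlights for hardware-generalized learning.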