Abstract

In sub-20 nm technologies, DRAM cells suffer from poor retention time. As technology scales, this problem worsens, significantly increasing DRAM refresh power. It is especially problematic in memory-heavy applications such as deep learning systems, where large amounts of DRAM are required and refresh power contributes a considerable portion of total system power. With the growth in deep learning workloads, this is set to get worse. In this work, we present a zero-cycle bit-masking (ZEM) scheme that exploits the asymmetry of retention failures to eliminate DRAM refresh in the inference of convolutional neural networks, natural language processing, and image generation based on generative adversarial networks (GANs). Through careful analysis, we derive a bit-error-rate (BER) threshold that does not affect inference accuracy. Our proposed architecture, along with the techniques involved, is applicable to all types of DRAM. Our results on 16Gb devices show that ZEM can improve performance by up to 17.31% while reducing the total energy consumed by DRAM by up to 43.03%, depending on the type of DRAM.
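As a rough illustration of how such a BER threshold could be derived empirically, the sketch below (Python with NumPy) injects random bit flips into 8-bit quantized weights at increasing bit-error rates and keeps the largest rate whose accuracy remains within a tolerance of the clean baseline. The evaluate callback, the uint8 weight layout, and the tolerance value are illustrative assumptions, not the paper's exact methodology.

    import numpy as np

    def inject_bit_errors(weights_u8, ber, rng):
        """Flip each bit of a uint8 weight array independently with probability ber."""
        bits = np.unpackbits(weights_u8.reshape(-1))
        flips = rng.random(bits.shape) < ber
        return np.packbits(bits ^ flips).reshape(weights_u8.shape)

    def find_ber_threshold(evaluate, weights_u8, bers, tol=1e-3, seed=0):
        """Return the largest BER whose accuracy stays within tol of the clean run,
        assuming accuracy degrades monotonically with BER."""
        rng = np.random.default_rng(seed)
        baseline = evaluate(weights_u8)  # accuracy with error-free weights
        threshold = 0.0
        for ber in sorted(bers):
            acc = evaluate(inject_bit_errors(weights_u8, ber, rng))
            if baseline - acc <= tol:
                threshold = ber
        return threshold

In practice such a sweep would be repeated over several random seeds, since a single fault-injection trial at a given BER can be unrepresentative.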

Highlights

  • Deep learning has made a quantum leap in many application domains such as computer vision, speech processing, language translation, and content generation

  • For LPDDR, the performance of zero-cycle bit-masking (ZEM) with ECC is almost identical to that of non-ECC ZEM, because the on-die ECC is integrated into the DRAM without any change in timing [45]

  • We have presented ZEM, an encoding scheme that enables refresh-less DRAM for high-performance and energy-efficient deep learning systems

Summary

INTRODUCTION

Deep learning has made a quantum leap in many application domains such as computer vision, speech processing, language translation, and content generation.

BASIC IDEA

In this work, we assume that all deep learning data, such as weights, biases, and activations, are stored in the refresh-less approximate DRAM, while other critical information is kept in precise DRAM that is refreshed normally. This can be implemented based on the architecture of [16], [17]. We propose ZEM, a zero-cycle encoding scheme placed in the memory controller (as shown in Fig. 7) that makes deep learning parameters more resilient to random bit errors, enabling refresh-less approximate DRAM for deep learning applications. For GAN, we used three models from [28]: Horse2Zebra, VangoghStyle, and MonetStyle, which transform images of horses into zebras, and real photos into drawings in the styles of Van Gogh and Monet.
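To make the asymmetry idea concrete, below is a minimal sketch (Python with NumPy) of one inversion-style masking a memory controller could apply: assuming retention failures are biased toward 1-to-0 flips, each word is stored inverted whenever it holds a majority of 1 bits, so fewer failure-prone bits reside in the cells. The per-word flag bit and the simple majority rule are illustrative assumptions, not the exact ZEM encoding.

    import numpy as np

    def encode_invert_mask(words_u8):
        """Invert each byte holding more 1s than 0s; return encoded bytes and flags."""
        ones = np.unpackbits(words_u8[:, None], axis=1).sum(axis=1)
        flags = ones > 4                   # majority of 1 bits in the 8-bit word
        encoded = np.where(flags, ~words_u8, words_u8).astype(np.uint8)
        return encoded, flags

    def decode_invert_mask(encoded, flags):
        """Undo the per-word inversion using the stored flags."""
        return np.where(flags, ~encoded, encoded).astype(np.uint8)

Because a mask of this style is pure combinational logic on the datapath, it adds no extra memory cycles, which is consistent with the "zero-cycle" label.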

EXTENDING THE BER OF DEEP LEARNING
EVALUATION
ENERGY RESULTS
COMPARED TO STATE-OF-THE-ART WORKS
CONCLUSION