Abstract

Memory subsystems are a major energy bottleneck in computing platforms due to frequent transfers between processors and off-chip memory. We propose approximate memory compression, a technique that leverages the intrinsic resilience of emerging workloads such as machine learning and data analytics to reduce off-chip memory traffic and energy. To realize approximate memory compression, we enhance the memory controller to be aware of memory regions that contain approximation-resilient data, and to transparently compress/decompress the data written to/read from these regions. To provide control over approximations, the quality-aware memory controller conforms to a specified error constraint for each approximate memory region. We design a software interface that programmers can use to identify data structures that are resilient to approximations. We also propose a runtime quality control framework that automatically determines the error constraints for the identified data structures such that a given target application-level quality is maintained. We evaluate our proposal by implementing a hardware prototype using the Intel UniPHY-DDR3 memory controller and NIOS-II processor, a Hynix DDR3 DRAM module, and a Stratix-IV FPGA development board. Across a suite of 8 machine learning benchmarks, approximate memory compression obtains a 1.28× improvement in DRAM energy and a simultaneous 11.5% improvement in execution time for a small (< 1.5%) loss in output quality.
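
The idea of compressing a memory region subject to a per-region error constraint can be illustrated with a minimal software sketch. This is not the compression scheme of the paper's memory controller, which the abstract does not specify; it assumes a simple uniform quantizer as a stand-in. The names `compress_block`, `decompress_block`, and `max_error` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a uniform quantizer standing in for an
# error-bounded lossy compressor. All names here are hypothetical.

def compress_block(values, max_error):
    """Compress a block of floats so each value is recovered within max_error.

    A non-positive max_error marks the region as exact: data is stored raw.
    """
    if max_error <= 0:
        return ("raw", list(values))
    step = 2.0 * max_error          # grid spacing; rounding error is at most step / 2
    base = min(values)              # offset so that all codes are non-negative
    codes = [round((v - base) / step) for v in values]
    return ("quantized", base, step, codes)


def decompress_block(block):
    """Reconstruct the (approximate) values from a compressed block."""
    if block[0] == "raw":
        return list(block[1])
    _, base, step, codes = block
    return [base + c * step for c in codes]


def bits_per_value(block):
    """Bits needed per stored code (assumes 32-bit values when stored raw)."""
    if block[0] == "raw":
        return 32
    return max(1, max(block[3]).bit_length())


if __name__ == "__main__":
    data = [0.10, 0.12, 0.50, 0.49, 0.11]
    block = compress_block(data, max_error=0.01)
    restored = decompress_block(block)
    worst = max(abs(a - b) for a, b in zip(data, restored))
    print("bits per value:", bits_per_value(block))
    print("worst-case error:", worst)
```

In this sketch a looser error constraint yields a coarser grid and fewer bits per value, which is the kind of trade-off between traffic and output quality that a quality control framework would tune.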
