Abstract

Recently, dramatic improvements in memory performance have been in high demand for data-intensive application services such as deep learning, big data, and immersive video. To this end, throughput-oriented memories such as high bandwidth memory (HBM) and the hybrid memory cube (HMC) have been introduced to provide high bandwidth, and various research efforts have been conducted to use them effectively. Among them, near-memory processing (NMP) is a concept that exploits the available bandwidth and reduces power consumption by placing computation logic near the memory. In an NMP-enabled system, a processor hierarchy consisting of hosts and NMPs is formed based on the distance from the main memory. In this paper, an evaluation tool is proposed to obtain the optimal design decision considering the power-time trade-off in the processor hierarchy. Whenever the operating conditions and constraints change, the task-level offloading decision is made dynamically. For a realistic NMP-enabled system environment, the relationship among the HBM, the host, and the NMP should be carefully considered: hosts and NMPs are almost hidden from each other, and communication between them is extremely limited. In the simulation results, popular benchmarks and a machine learning application are used to demonstrate the power-time trade-offs depending on the application and system conditions.
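To make the power-time trade-off concrete, the following minimal sketch shows one plausible way a task-level offloading decision could be scored across a host/NMP hierarchy. It is an illustration under assumed inputs, not the proposed evaluation tool: the cost function, the weighting factor `alpha`, and the per-task time and power estimates are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Estimate:
    time_s: float   # estimated execution time of the task on this processor
    power_w: float  # estimated average power draw on this processor

def offload_decision(estimates: dict, alpha: float = 0.5) -> str:
    """Pick the processor name ('host', 'nmp', ...) that minimizes a
    weighted time/energy cost. alpha = 1.0 optimizes time only, 0.0
    energy only. A real tool would normalize units; the weighting here
    is purely illustrative."""
    def cost(e: Estimate) -> float:
        energy_j = e.power_w * e.time_s  # energy = power x time
        return alpha * e.time_s + (1.0 - alpha) * energy_j
    return min(estimates, key=lambda name: cost(estimates[name]))

# Hypothetical memory-bound task: the NMP runs at a lower clock, but far
# less data movement means both time and power improve near the memory.
task = {"host": Estimate(time_s=1.2, power_w=35.0),
        "nmp":  Estimate(time_s=0.9, power_w=12.0)}
print(offload_decision(task, alpha=0.3))  # -> 'nmp' under these numbers
```

Under a different `alpha` or different per-task estimates the decision flips, which is exactly the dynamic, condition-dependent behavior the proposed tool is meant to evaluate.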

Introduction

Efforts have been conducted to increase both processor speed and memory size; nevertheless, the memory bottleneck problem has become increasingly serious and is a critical issue that must be overcome urgently to improve overall system performance. Data-intensive applications such as deep learning, big data, and immersive video have attracted attention, and a significant improvement in memory performance is in high demand. Through-silicon via (TSV)-based stacked DRAM memories such as high bandwidth memory (HBM) [1] and the hybrid memory cube (HMC) [2] have been introduced to provide high bandwidth through a wide I/O. This next-generation memory has a structure in which multiple DRAM dies are stacked on a base logic layer, and interlayer communication is achieved through high-speed TSV technology. It thus provides high bandwidth with low power consumption.
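As a back-of-the-envelope illustration of why a wide I/O matters, the short calculation below compares the peak bandwidth of an HBM2-class stack against a conventional 64-bit DDR4 channel. The pin widths and transfer rates are standard published figures for those interface generations, not values taken from this paper.

```python
# Wide-I/O stacked DRAM reaches high bandwidth through many pins
# rather than a very high per-pin rate.
def peak_bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s = pins * per-pin rate (Gb/s) / 8 bits."""
    return bus_width_bits * pin_rate_gbps / 8

# HBM2-class stack: 1024-bit interface at 2.0 Gb/s per pin -> 256 GB/s.
print(peak_bandwidth_gbs(1024, 2.0))  # 256.0
# Contrast: one 64-bit DDR4-3200 channel -> 25.6 GB/s.
print(peak_bandwidth_gbs(64, 3.2))    # 25.6
```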
