Abstract

Due to the amount of data involved in emerging deep learning and big data applications, operations related to data movement have quickly become a bottleneck. Data-centric computing (DCC), as enabled by processing-in-memory (PIM) and near-memory processing (NMP) paradigms, aims to accelerate these types of applications by moving the computation closer to the data. Over the past few years, researchers have proposed various memory architectures that enable DCC systems, such as logic layers in 3D-stacked memories or charge-sharing-based bitwise operations in dynamic random-access memory (DRAM). However, application-specific memory access patterns, power and thermal concerns, memory technology limitations, and inconsistent performance gains complicate the offloading of computation in DCC systems. Therefore, designing intelligent resource management techniques for computation offloading is vital for leveraging the potential offered by this new paradigm. In this article, we survey the major trends in managing PIM and NMP-based DCC systems and provide a review of the landscape of resource management techniques employed by system designers for such systems. Additionally, we discuss the future challenges and opportunities in DCC management.

Highlights

  • For the past few decades, memory performance improvements have lagged behind compute performance improvements, creating an increasing mismatch between the time to transfer data and the time to perform computations on these data

  • In order to properly manage what is offloaded onto PIM or near-memory processing (NMP) systems, where it is offloaded, and when it is offloaded, prior work has utilized one of three different strategies: (1) code annotation: techniques that rely on the programmers to select and determine the appropriate sections of code to offload; (2) compiler optimization: techniques that attempt to automatically identify what to offload during compile-time; (3) online heuristics: techniques that use a set of rules to determine what to offload during run-time

  • There are several challenges and opportunities for resource management of PIM/NMP substrates related to generalizability, multi-objective considerations, reliability, and the application of more intelligent techniques, e.g., machine learning (ML), as discussed in the article
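As an illustration of the third strategy, an online heuristic might compare a kernel's arithmetic intensity (operations per byte moved) against a threshold and offload memory-bound kernels to the memory-side processing element. The function below is a minimal sketch; the threshold and the `should_offload` cost model are illustrative assumptions, not a technique taken from any surveyed system.

```python
# Minimal sketch of an online offloading heuristic (strategy 3 above).
# The cost model and the 0.5 ops/byte threshold are illustrative
# assumptions, not values from the surveyed works.

def should_offload(ops: int, bytes_accessed: int,
                   intensity_threshold: float = 0.5) -> bool:
    """Decide at run-time whether to offload a kernel to the in-memory PE.

    Kernels with low arithmetic intensity (few operations per byte moved)
    spend most of their time transferring data, so they benefit from the
    higher internal bandwidth of PIM/NMP; compute-bound kernels stay on
    the host, whose cores are faster.
    """
    if bytes_accessed == 0:
        return False  # nothing to move; keep the kernel on the host
    arithmetic_intensity = ops / bytes_accessed
    return arithmetic_intensity < intensity_threshold
```

For example, a streaming vector add performs roughly one operation per dozen bytes transferred and would be offloaded under this rule, while a dense matrix multiply reuses each operand many times and would remain on the host.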


Summary

Introduction

For the past few decades, memory performance improvements have lagged behind compute performance improvements, creating an increasing mismatch between the time to transfer data and the time to perform computations on these data (the “memory wall”). It is evident that the large latencies and energies involved in moving data to the processor will present an overwhelming bottleneck in future systems. To address this issue, researchers have proposed to reduce these costly data movements through data-centric computing (DCC), where some of the computation is moved in proximity to the memory. For example, an Intel x86 server equipped with UPMEM's PIM modules has shown better performance for genomic applications and 10× better energy consumption than an equivalent server without them [4]. Both PIM and NMP systems have the potential to speed up application execution by reducing data movement. We survey the landscape of resource management techniques that decide which computations are offloaded onto PIM/NMP systems.

Prior Surveys and Scope
Data-Centric Computing Architectures
PIM Using DRAM
PIM Using NVM
Near-Memory Processing
PE Types
Memory Types
Resource Management
Optimization Objectives
Performance
Energy Efficiency
Power and Thermal Efficiency
Optimization Knobs
Identification of Offloading Workloads
Selection of Memory PE
Timing of Offloads
Management Techniques
Code Annotation Approaches
Compiler-Based Approaches
Atomic Instructions in HMC
CAIRO
Online Heuristic Approaches
Findings
Conclusions and Future Directions