Abstract

Packet processing performance in a Network Function Virtualization (NFV)-aware environment depends on the memory access performance of commercial-off-the-shelf (COTS) hardware systems. Table lookup is a typical packet processing task that depends heavily on memory access performance. Thus, the on-chip cache memories of the CPU are becoming increasingly critical for many high-performance software routers and switches. Moreover, in carrier networks, multiple applications run in parallel on the same hardware system, which demands large cache capacity. In this paper, we propose a packet processing architecture that enhances memory access parallelism by combining on-chip last-level cache (LLC) slices with off-chip interleaved 3-dimensional (3D)-stacked Dynamic Random Access Memory (DRAM) devices. Table entries are stored in the off-chip 3D-stacked DRAM so that memory requests are processed in parallel through bank interleaving and channel parallelism. In addition, cached entries are distributed across on-chip LLC slices according to a memory-address-based hash function so that each CPU core can access the on-chip LLC in parallel. The evaluation results show that, compared to an architecture with an on-chip shared LLC and one without an on-chip LLC, the proposed architecture reduces memory access latency by 62% and 12%, increases throughput by 108% and 2%, and reduces the blocking probability of memory requests by 96% and 50%, respectively.
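To make the two parallelism mechanisms concrete, the sketch below shows how a physical address might be mapped to an LLC slice (via an address-based hash) and to a DRAM channel/bank (via low-order interleaving). The line size, slice count, channel/bank counts, and the XOR hash are illustrative assumptions, not the paper's exact parameters; real CPUs use undocumented slice-hash functions.

```python
# Illustrative address mapping (assumed parameters, not the paper's exact design):
# 64 B cache line, 8 LLC slices, 4 DRAM channels, 16 banks per channel.

LINE_BITS = 6          # log2(64 B cache line)
NUM_SLICES = 8
NUM_CHANNELS = 4
BANKS_PER_CHANNEL = 16

def llc_slice(addr: int) -> int:
    """Address-based hash that spreads cache lines across LLC slices,
    letting different cores access different slices in parallel.
    A simple XOR-fold stands in for the CPU's undocumented hash."""
    line = addr >> LINE_BITS
    return (line ^ (line >> 3) ^ (line >> 6)) % NUM_SLICES

def dram_bank(addr: int) -> tuple[int, int]:
    """Low-order interleaving: consecutive cache lines land on
    different channels (and then banks), so a burst of table-entry
    reads can be serviced concurrently."""
    line = addr >> LINE_BITS
    channel = line % NUM_CHANNELS
    bank = (line // NUM_CHANNELS) % BANKS_PER_CHANNEL
    return channel, bank

# Consecutive cache lines rotate across channels, enabling
# channel parallelism for back-to-back table lookups.
for a in range(0, 4 * 64, 64):
    print(hex(a), llc_slice(a), dram_bank(a))
```

The key property is that neighboring table entries never queue behind one another at a single slice or bank, which is exactly what the combined LLC-slice/3D-DRAM design exploits.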

Highlights

  • Packet processing performance in a Network Function Virtualization (NFV)-aware environment depends on the memory access performance of commercial-off-the-shelf (COTS) hardware systems

  • Table entries are stored in the off-chip 3-dimensional (3D)-stacked Dynamic Random Access Memory (DRAM) so that memory requests are processed in parallel through bank interleaving and channel parallelism

  • The system model consists of a CPU with six cores, each with a queue in front of it and dedicated level 1 (L1) and level 2 (L2) caches; a shared LLC with a queue in front of it; and a 3D-stacked DRAM with its controller
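Since the evaluation reports the blocking probability of memory requests queued at the LLC and DRAM, a standard Erlang-B computation illustrates the kind of quantity involved. This is a generic queueing formula under an assumed loss-system model; the paper's actual traffic model may differ.

```python
def erlang_b(offered_load: float, servers: int) -> float:
    """Erlang-B blocking probability for a loss system with the given
    offered load (in Erlangs) and number of servers (e.g. DRAM banks
    that can serve requests concurrently), via the stable recurrence
    B(0) = 1;  B(m) = a*B(m-1) / (m + a*B(m-1))."""
    b = 1.0
    for m in range(1, servers + 1):
        b = offered_load * b / (m + offered_load * b)
    return b

# More parallel "servers" (banks/channels) sharply reduce blocking
# at the same offered load -- the intuition behind bank interleaving.
print(erlang_b(4.0, 4))   # few banks: high blocking
print(erlang_b(4.0, 16))  # many banks: low blocking
```

The qualitative trend matches the paper's result that adding memory-level parallelism (more banks/channels, more LLC slices) cuts the blocking probability of memory requests.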


Summary

INTRODUCTION

Packet processing performance in a Network Function Virtualization (NFV)-aware environment depends on the memory access performance of commercial-off-the-shelf (COTS) hardware systems. These network functions consist of several packet processing elements, such as parsing, classification, editing, and metering, each of which requires table lookups and memory accesses. When these network functions run on the same hardware system, usually called a multi-tenant environment, multiple applications issue many memory accesses from their corresponding CPU cores in parallel. This situation requires both speed and capacity of cache memories for high-performance packet processing. No prior work evaluates the performance dependency of the proposed architecture on the number of assigned resources when combining LLC slices with 3D-stacked DRAM. This paper proposes a packet processing architecture that enhances memory access parallelism by combining on-chip LLC slices and off-chip 3D-stacked DRAM devices.
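The classification step mentioned above reduces to a table lookup keyed by packet header fields. The hypothetical sketch below (the 5-tuple key, table contents, and action strings are illustrative, not from the paper) shows why each classified packet costs at least one memory access: every lookup is a hash-table probe into a table that may or may not be resident in cache.

```python
# Hypothetical flow-classification lookup: each probe into flow_table
# is a memory access whose latency depends on whether the entry sits
# in an LLC slice (fast) or must be fetched from DRAM (slow).

from typing import NamedTuple, Optional

class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int  # e.g. 6 = TCP, 17 = UDP

# Illustrative table entries (assumed, not from the paper).
flow_table: dict[FiveTuple, str] = {
    FiveTuple("10.0.0.1", "10.0.0.2", 1234, 80, 6): "forward:port1",
    FiveTuple("10.0.0.3", "10.0.0.4", 5678, 53, 17): "forward:port2",
}

def classify(pkt: FiveTuple) -> Optional[str]:
    """One table lookup per packet; returns the action or None on miss."""
    return flow_table.get(pkt)
```

With millions of such lookups per second across multiple cores, the table's placement across LLC slices and DRAM banks, as proposed in this paper, determines the achievable throughput.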

BACKGROUND
TRAFFIC MODEL
BLOCKING PROBABILITY AND AVERAGE WAITING TIME
NUMERICAL SIMULATION RESULTS
RELATED WORK
DISCUSSION
Findings
CONCLUSION

