Abstract
Longest prefix matching (LPM) is a fundamental process in IP routing used not only in traditional hardware routers but also in software middleboxes. However, the performance of LPM in software is still insufficient for processing packets at over 100 Gbps, although previous studies have tackled this issue by exploiting the CPU cache or accelerators such as GPUs. To further improve the performance of software LPM, we propose a novel LPM method called Spider, which exploits the single-instruction multiple-data (SIMD) mechanism in the CPU. Spider performs LPM for up to 16 destination IP addresses in parallel, using a routing table structure carefully designed for processing with SIMD instructions. We evaluated Spider from three perspectives: the improvement in LPM performance derived from the parallelism provided by the SIMD mechanism, a performance comparison with other methods, and performance scalability. The evaluation shows that Spider dramatically improves LPM performance, reaching 1.8-3.2 times the throughput of state-of-the-art methods. Moreover, Spider achieves 5,074 million lookups per second with 16 CPU cores, which is equivalent to a processing capacity of 3.4 Tbps with short packets; this performance opens up the possibility of terabit-class packet processing in software.
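For intuition, the following is a minimal scalar sketch of what each of the up-to-16 parallel lookups computes. It is not Spider's SIMD routing table structure, only the reference semantics of LPM: among all prefixes covering a destination address, the one with the longest prefix length wins. All names here (`build_table`, `lookup_batch`, the example routes) are illustrative, not from the paper.

```python
# Reference semantics of longest prefix matching (LPM) for IPv4.
# NOT Spider's SIMD table layout: a scalar sketch for intuition only.
import ipaddress

def build_table(routes):
    """routes: list of (CIDR string, next hop). Returns entries as
    (network_int, prefix_len, next_hop), sorted by descending prefix
    length so the first match is the longest match."""
    table = []
    for cidr, nh in routes:
        net = ipaddress.ip_network(cidr)
        table.append((int(net.network_address), net.prefixlen, nh))
    table.sort(key=lambda e: -e[1])
    return table

def lookup(table, addr):
    """Return the next hop of the longest matching prefix, or None."""
    a = int(ipaddress.ip_address(addr))
    for net, plen, nh in table:
        mask = ((1 << plen) - 1) << (32 - plen) if plen else 0
        if (a & mask) == net:
            return nh
    return None

def lookup_batch(table, addrs):
    """Look up a batch of destination addresses. Spider batches up to
    16 addresses and resolves them in parallel with SIMD instructions;
    here the loop is sequential, but the per-address result is the same."""
    return [lookup(table, a) for a in addrs]

routes = [("10.0.0.0/8", "A"), ("10.1.0.0/16", "B"), ("0.0.0.0/0", "C")]
table = build_table(routes)
print(lookup_batch(table, ["10.1.2.3", "10.2.3.4", "192.0.2.1"]))
# -> ['B', 'A', 'C']
```

Note that 10.1.2.3 is covered by both 10.0.0.0/8 and 10.1.0.0/16, and LPM selects the /16; batching the lookups, as in `lookup_batch`, is what makes a SIMD formulation like Spider's possible.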
Highlights
Longest prefix matching (LPM) is a fundamental process of IP routing in both hardware routers and software middleboxes
Although software LPM cannot deliver as much performance as hardware, software middleboxes are actively used for various use cases, e.g., network function virtualization (NFV) [1], [2], software routers for backbone networks [3], and software-defined WANs [4]
In this paper, we have proposed Spider, which achieves an improvement of the LPM performance by parallelizing its lookup procedure in a single CPU core
Summary
Longest prefix matching (LPM) is a fundamental process of IP routing in both hardware routers and software middleboxes. A major approach for fast LPM in software is to shorten the time for looking up a destination IP address by leveraging the CPU cache to minimize the latency of data accesses [5]-[12]. However, the performance of these methods has not yet reached the speed of multiple 100 Gbps interfaces, and further improvement would be limited: because they have already thoroughly exploited the CPU cache, their remaining improvement factor is an increase in CPU frequency, which has stagnated [13]. The evaluation is extended from our previous work to reveal more detailed characteristics of Spider, including its applicability to real-world packet processing applications (§ V-C), the CPU cycles spent on the lookup procedure (§ V-D), and the performance under different CPU frequencies (§ V-E). The evaluation shows that Spider achieves a major improvement (1.8-2.6 times for IPv4 and 2.2-3.2 times for IPv6) over state-of-the-art methods and delivers the processing capacity of 34 ports of 100 Gbps interfaces with 16 CPU cores. The performance improvement of Spider opens up the possibility of terabit-class packet processing in software.