Enhancing the performance of the aggregated bit vector algorithm in network packet classification using GPU.

Mahdi Abbasi, Razieh Tahouri, Milad Rafiee

doi:10.7717/peerj-cs.185

Open Access

https://doi.org/10.7717/peerj-cs.185

Abstract

Packet classification is a computationally intensive, highly parallelizable task in many advanced network systems like high-speed routers and firewalls that enable different functionalities through discriminating incoming traffic. Recently, graphics processing units (GPUs) have been exploited as efficient accelerators for parallel implementation of software classifiers. The aggregated bit vector is a highly parallelizable packet classification algorithm. In this work, first we present a parallel kernel for running this algorithm on GPUs. Next, we adapt an asymptotic analysis method which predicts any empirical result of the proposed kernel. Experimental results not only confirm the efficiency of the proposed parallel kernel but also reveal the accuracy of the analysis method in predicting important trends in experimental results.

Highlights

The considerable evolution in the speed of internet communications makes the gap between communication speed and processing speed ever wider
A review of the related literature shows that no study has parallelized the aggregated bit vector (ABV) algorithm on many-core machines such as graphics processing units (GPUs)
In 2006, Nvidia supplied a software platform called compute unified device architecture (CUDA) for performing nongraphic computations on graphics processors (Nakano, 2013a). CUDA gives programmers access to the hardware capabilities of graphics processors in their nongraphic programs, which can speed up complicated algorithms

Summary

INTRODUCTION

The rapid growth in the speed of internet communications keeps widening the gap between communication speed and processing speed. Hardware methods have achieved the highest classification speeds by performing parallel lookups on ternary content addressable memories (Sun et al., 2017). However, these hardware modules suffer from problems such as high price, high power consumption, an architecture that is inflexible to any variation in the filters, and inefficiency. A review of the related literature shows that no study has parallelized the aggregated bit vector (ABV) algorithm on GPU-like many-core machines. This decomposition-based algorithm has a structure that lets it be highly parallelized on GPU systems (Baboescu & Varghese, 2001). Conclusions and directions for future research are discussed in the last section.
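
To make the decomposition idea behind ABV more concrete, the following is a minimal sequential sketch in Python, not the GPU kernel proposed in the paper. It assumes that a per-field lookup has already produced one bit vector per header field, where bit j is set if rule j matches the packet in that field. The names `WORD`, `aggregate`, and `classify`, the chunk size of 4, and the use of plain lists instead of packed machine words are all illustrative assumptions.

```python
from typing import List, Optional

WORD = 4  # assumed aggregation size: rule bits summarized by one aggregate bit


def aggregate(bits: List[int]) -> List[int]:
    """Build one aggregate bit per WORD-sized chunk (1 if any rule bit in it is set)."""
    return [int(any(bits[i:i + WORD])) for i in range(0, len(bits), WORD)]


def classify(field_vectors: List[List[int]]) -> Optional[int]:
    """Return the lowest-index rule that matches in every field, or None."""
    n_rules = len(field_vectors[0])
    aggs = [aggregate(v) for v in field_vectors]
    for w in range(len(aggs[0])):
        # Skip chunks in which at least one field has no matching rule at all.
        if not all(a[w] for a in aggs):
            continue
        # Aggregate bits may yield false positives, so check the real rule bits.
        for j in range(w * WORD, min((w + 1) * WORD, n_rules)):
            if all(v[j] for v in field_vectors):
                return j
    return None


# Hypothetical per-field results for 8 rules (rule 0 has the highest priority).
src_bits = [0, 0, 1, 0, 0, 1, 0, 1]
dst_bits = [0, 1, 0, 0, 0, 0, 0, 1]
print(classify([src_bits, dst_bits]))
```

In this example both aggregate vectors have their first bit set, yet no rule in the first chunk matches both fields, which is the false-positive case the inner loop guards against. Chunks are independent of one another, which hints at why the algorithm lends itself to parallel execution.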

BACKGROUND

RELATED WORK

CONCLUSION