Abstract

We present a novel parallel IP address lookup architecture based on graphics processing unit (GPU) via compute unified device architecture (CUDA). Our architecture consists of two functions: 1) host function and 2) device function. The host function is executed by a CPU to construct and update the data structure of IP address lookup executed by the device function in a GPU. Both host and device functions are executed simultaneously to fully utilize computational resources. To shorten the lookup time, a trie-based data structure optimized for CUDA is developed. The trie-based data structure uses multi-bit stride to shorten the trie depth and also improves the efficiency of texture cache in GPUs. The experimental results show that a low-end G92 GPU can achieve a throughput of more than 1.3 billion packets per second for IPv4 routing tables with more than 350K prefixes while a high-end GT200 GPU can further double the performance. By employing dual data structures, the implementation can support several hundred thousand updates per second.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call