Fast Approximation of the Top-k Items in Data Streams Using a Reconfigurable Accelerator

Ali Ebrahim,Jalal Khalifat

doi:10.1007/978-3-030-79025-7_1

Abstract

AbstractThis paper presents a novel method for finding the top-k items in data streams using a reconfigurable accelerator. The accelerator is capable of extracting an approximate list of the topmost frequently occurring items in an input stream, which is only scanned once without the need for random-access. The accelerator is based on a hardware architecture that implements the well-known Probabilistic sampling algorithm by mapping its main processing stages to two custom systolic arrays. The proposed architecture is the first hardware implementation of this algorithm, which shows better scalability compared to other architectures that are based on other stream algorithms. When implemented on an Intel Arria 10 FPGA (10AX115N2F45E1SG), 50% of the FPGA chip is sufficient for 3000+ Processing Elements (PEs). Experimental results on both synthetic and real input datasets showed very good accuracy and significant throughput gains compared to existing solutions. With achieved throughputs exceeding 300 Million items/s, we report average speedups of 20x compared to typical software implementations, 1.5x compared to GPU-accelerated implementations, and 1.8x compared to the fastest FPGA implementation.KeywordsData streamProbabilistic samplingTop-k itemsFPGA

Full Text