Fast approximation of the top‐k items in data streams using FPGAs

Ali Ebrahim, Jalal Khalifat

doi:10.1049/cdt2.12053

Open Access

https://doi.org/10.1049/cdt2.12053

Journal: IET Computers & Digital Techniques
Publication Date: Feb 19, 2023
Citations: 2
License type: CC BY 4.0

Affiliation: University of Bahrain

Abstract

Two methods are presented for finding the top-k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the most frequently occurring items in a single pass over the input stream without the need for random access. The first variant of the accelerator implements the well-known Probabilistic sampling algorithm by mapping its main processing stages to a hardware architecture consisting of two custom systolic arrays. The proposed architecture retains all the properties of this algorithm, which works even if the stream size is unknown at run time. The architecture scales better than architectures based on other stream algorithms. In addition, experimental results on both synthetic and real datasets, obtained with the accelerator implemented on an Intel Arria 10 GX 1150 FPGA device, showed very good accuracy and significant throughput gains compared to existing software and hardware-accelerated solutions. The second variant of the accelerator is specifically tailored for applications requiring higher accuracy, provided that the size of the stream is known at run time. This variant takes advantage of the embedded memory resources in an FPGA to implement a sketch-based filter that precedes the main systolic array in the accelerator's pipeline. This filter enhances the accuracy of the accelerator by pre-processing the stream to remove many of the insignificant items, allowing the accelerator to process a significantly smaller filtered stream.
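
The idea behind the second variant, a sketch-based filter placed ahead of the main counting stage, can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the accelerator's actual design: the abstract does not name the sketch type, so a Count-Min sketch is assumed here, and the names `CountMinFilter`, `approximate_top_k`, and `threshold_fraction` are hypothetical. The stream size is passed in explicitly because the filter threshold is derived from it, which mirrors the requirement that the stream size be known at run time.

```python
import hashlib
import random
from collections import Counter


class CountMinFilter:
    """Count-Min sketch used as a pre-filter (assumed sketch type, not from the paper)."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One independent hash per row, derived by salting BLAKE2b with the row index.
        data = repr(item).encode()
        for row in range(self.depth):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add_and_estimate(self, item):
        # Increment one counter per row and return the minimum, which
        # never underestimates the item's true count so far.
        estimate = None
        for row, col in self._cells(item):
            self.rows[row][col] += 1
            value = self.rows[row][col]
            estimate = value if estimate is None else min(estimate, value)
        return estimate


def approximate_top_k(stream, k, stream_size, threshold_fraction=0.01,
                      width=1024, depth=4):
    """Single pass: items reach the counting stage only once the sketch
    estimates their frequency at or above threshold_fraction * stream_size."""
    threshold = max(1, int(threshold_fraction * stream_size))
    sketch = CountMinFilter(width, depth)
    survivors = Counter()  # stands in for the main counting stage
    for item in stream:
        if sketch.add_and_estimate(item) >= threshold:
            survivors[item] += 1
    return survivors.most_common(k)


# Usage: three heavy hitters hidden among 1000 items that occur once each.
stream = ["a"] * 500 + ["b"] * 300 + ["c"] * 200 + [f"rare{i}" for i in range(1000)]
random.Random(0).shuffle(stream)
print(approximate_top_k(stream, k=3, stream_size=len(stream)))
```

In this model the counts reported for surviving items are lower bounds, since occurrences seen before an item crosses the threshold are not forwarded. The trade-off is that the counting stage only has to track the much smaller filtered stream.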

Full Text