In large-scale data analysis, efficient Top-K query processing is critical for numerous applications in science, industry, and society. Traditional approaches often involve substantial data transfer and computational overhead, making it difficult to meet the scalability and efficiency demands of modern datasets. This paper proposes a GPU-accelerated Top-K query processing method that integrates data compression and pre-filtering techniques to address these challenges. By partitioning and compressing data on the host side, it alleviates common PCIe bottlenecks in heterogeneous computing environments. A metadata-driven pre-filtering technique further reduces the data volume processed on the GPU, significantly improving query performance, particularly when handling anti-correlated datasets. Experimental results demonstrate that this method markedly reduces data transfer and processing time, confirming its effectiveness in enhancing the efficiency and scalability of Top-K query processing compared to existing methods.
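The abstract does not specify how the metadata-driven pre-filter works, so the following is only a minimal host-side sketch of one common form of the idea, not the paper's algorithm. It assumes that each record has already been reduced to a single numeric score, that the metadata consists of per-partition minimum, maximum, and count, and that partitions are non-empty. The function names (`build_metadata`, `prefilter_partitions`, `top_k`) are hypothetical. A partition is skipped when its maximum falls below a lower bound on the K-th largest score, so it would never need to be compressed or transferred to the GPU.

```python
import heapq
from typing import List, Sequence, Tuple

# (min, max, count) summary of one partition; an assumed metadata layout.
Meta = Tuple[float, float, int]


def build_metadata(partitions: Sequence[Sequence[float]]) -> List[Meta]:
    """Compute per-partition (min, max, count) on the host.

    Assumes every partition is non-empty.
    """
    return [(min(p), max(p), len(p)) for p in partitions]


def prefilter_partitions(metadata: Sequence[Meta], k: int) -> List[int]:
    """Return indices of partitions that may still hold a Top-K score."""
    # Visit partitions in descending order of their minimum. Once the
    # visited partitions hold at least k scores, the smallest minimum
    # seen so far is a lower bound (tau) on the k-th largest score.
    seen = 0
    tau = None
    for lo, _hi, count in sorted(metadata, key=lambda m: m[0], reverse=True):
        seen += count
        if seen >= k:
            tau = lo
            break
    if tau is None:
        # Fewer than k scores in total: nothing can be pruned.
        return list(range(len(metadata)))
    # Keep ties (max == tau), since such scores can still be in the Top-K.
    return [i for i, (_lo, hi, _n) in enumerate(metadata) if hi >= tau]


def top_k(partitions: Sequence[Sequence[float]], k: int) -> List[float]:
    """Top-K over the surviving partitions only (CPU stand-in for the GPU step)."""
    keep = prefilter_partitions(build_metadata(partitions), k)
    candidates = (score for i in keep for score in partitions[i])
    return heapq.nlargest(k, candidates)
```

For example, with partitions `[[1, 2, 3], [10, 11, 12], [4, 5, 6], [20, 21, 22]]` and `k = 4`, only the second and fourth partitions survive the filter, and the result matches a full sort of all twelve scores. In a real pipeline the surviving partitions would be the ones compressed and sent over PCIe, and the final selection would run on the GPU.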