Abstract

The recently proposed learned Bloom filter (LBF) opens a new perspective on how to reconstruct Bloom filters with machine learning. However, the LBF incurs a high time cost and does not apply to multidimensional spatial data. In this paper, we propose a prefix-based and adaptive learned Bloom filter (PA-LBF) for spatial data that efficiently supports insertion and deletion. The proposed PA-LBF consists of three parts. (1) Prefix-based classification: the Z-order space-filling curve is used to encode the data, extract prefixes, and classify the data by prefix. (2) Adaptive learning: multiple independent adaptive sub-LBFs are trained on the suffixes of the data; combined with part 1, this reduces the false positive rate (FPR) and the time consumed by querying and learning. (3) Backup filters: counting Bloom filters (CBFs) serve as backup filters, and two kinds of backup CBF are constructed to handle different insertion and deletion frequencies. Experimental results support the theoretical analysis and show that, with the same memory usage, the PA-LBF reduces the FPR by 84.87%, 79.53%, and 43.01% compared with the LBF on three real-world spatial datasets. Moreover, the PA-LBF reduces time consumption by factors of up to 5× and 2.05× relative to the LBF for the query and learning processes, respectively.
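To make the prefix-based classification step concrete, the following is a minimal sketch of Z-order (Morton) encoding for two-dimensional integer coordinates, followed by a split of each code into a prefix and a suffix. It is an illustration under stated assumptions, not the implementation used in the paper: the function names (`interleave_bits`, `split_prefix`, `group_by_prefix`), the restriction to two dimensions, the bit widths, and the prefix length are all hypothetical choices. The abstract does not specify how coordinates are quantized or how long the prefix is.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Z-order (Morton) code of two non-negative integers.

    Bits of x go to even positions and bits of y to odd positions,
    so the result uses 2 * bits bits. (Illustrative helper, assumed.)
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def split_prefix(code: int, total_bits: int, prefix_bits: int) -> Tuple[int, int]:
    """Split a Z-order code into (prefix, suffix).

    The prefix is the top `prefix_bits` bits; the suffix is the rest.
    The prefix length is an assumed tuning parameter.
    """
    suffix_bits = total_bits - prefix_bits
    prefix = code >> suffix_bits
    suffix = code & ((1 << suffix_bits) - 1)
    return prefix, suffix


def group_by_prefix(
    points: Iterable[Tuple[int, int]], bits: int, prefix_bits: int
) -> Dict[int, List[int]]:
    """Group the suffixes of the points' Z-order codes by shared prefix."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for x, y in points:
        code = interleave_bits(x, y, bits)
        prefix, suffix = split_prefix(code, 2 * bits, prefix_bits)
        groups[prefix].append(suffix)
    return dict(groups)


if __name__ == "__main__":
    # Points that are close in space tend to share a Z-order prefix.
    pts = [(3, 5), (2, 4), (12, 9)]
    print(group_by_prefix(pts, bits=4, prefix_bits=4))
```

In a design along the lines the abstract describes, each prefix group would be handled by its own sub-filter trained on the suffixes only; this sketch stops at the grouping step and does not model the learned filters or the backup CBFs.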
