In sensitive database applications (e.g., time series, scientific databases, and biometrics) where the database is encrypted and outsourced to a public cloud, the secure approximate k-nearest neighbor (SANN) query is a fundamental research topic. Its aim is to retrieve high-dimensional objects that are similar to a given query from an encrypted database. Processing such queries without ever decrypting the data in the cloud remains a challenging task. Existing works suffer from various inherent limitations, such as query distinguishability, low efficiency, and non-recoverability, all of which lead to either fragile security or low accuracy. Hence, the majority of existing works in this field are impractical for industrial applications. In this work, we present a novel model that removes the above limitations. Specifically, we propose a reusable and single-interactive SANN paradigm in high-dimensional Euclidean space. First, we present a secure variation of the B+-tree (i.e., the Bc-tree) that quickly locates high-dimensional candidates in the cloud by leveraging comparable encryption. Based on that, an arbitrary query requestor acquires the approximate k-nearest neighbors by linearly scanning the candidates. In addition, two refinements are proposed: a multi-index strategy, which further improves the accuracy of the search result, and a boosting refinement strategy, which overcomes the heavy dependence on bandwidth. Finally, extensive evaluations on four data sets demonstrate that the proposed mechanisms achieve a superior tradeoff between accuracy and security.
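The candidate-then-scan flow described above can be illustrated with a minimal plaintext sketch. It is not the Bc-tree and involves no encryption: a sorted list of one-dimensional keys stands in for the ordered index, and the key of each point is assumed, purely for illustration, to be its distance to a pivot point. The abstract does not say how keys are derived. The names `build_index`, `candidates`, `approx_knn`, and the `width` parameter are hypothetical.

```python
import bisect
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_index(points, pivot):
    """Sorted (key, point_id) pairs; a plaintext stand-in for an ordered index.

    Assumption: the key is the distance from the point to `pivot`.
    """
    return sorted((euclidean(p, pivot), i) for i, p in enumerate(points))

def candidates(index, query_key, width):
    """Ids of up to 2*width entries whose keys surround `query_key`."""
    keys = [k for k, _ in index]
    pos = bisect.bisect_left(keys, query_key)
    lo = max(0, pos - width)
    hi = min(len(index), pos + width)
    return {i for _, i in index[lo:hi]}

def approx_knn(points, pivots, indexes, query, k, width):
    """Union the candidates from several indexes, then scan them linearly."""
    cand = set()
    for pivot, index in zip(pivots, indexes):
        cand |= candidates(index, euclidean(query, pivot), width)
    # Linear scan: rank every candidate by its true distance to the query.
    ranked = sorted(cand, key=lambda i: euclidean(points[i], query))
    return ranked[:k]
```

Using several pivots loosely mirrors the idea of a multi-index strategy: each extra index contributes candidates that a single one-dimensional ordering might miss, at the cost of a longer linear scan. In the scheme the abstract describes, the index lookup would run in the cloud over comparably encrypted keys and the scan would be done by the query requestor. This sketch does not model that split.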