The collective spatial keyword query (CoSKQ), an important variant of the spatial keyword query, aims to find a set of objects that collectively cover the user’s query keywords, are close to the query location, and are close to each other. However, existing work focuses only on CoSKQ with exact keyword matching and cannot handle spelling errors or conventional spelling differences (for example, color vs. colour), which are common in real applications. Moreover, query time information is not considered. To this end, this paper initiates the study of Time-aware Approximate Collective spatial Keyword query processing in traffic networks (TACoSKQ), where the objects are located on a predefined traffic network. We first prove that the TACoSKQ problem is NP-complete and design a hybrid index, the TDAG-tree, that supports query-object distance pruning, inter-object distance pruning, approximate keyword pruning, and temporal pruning simultaneously. We then present two approximation algorithms with provable approximation bounds to efficiently support TACoSKQ processing on traffic networks. Finally, extensive experiments on three real datasets demonstrate the efficiency and accuracy of the proposed algorithms.
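To make the notion of approximate keyword matching concrete, the following is a minimal sketch, assuming that keyword similarity is measured by Levenshtein edit distance with a small threshold. The abstract does not state which similarity measure the paper uses, so the measure, the function names `edit_distance` and `approx_covers`, and the `max_edits` parameter are all illustrative assumptions rather than the paper's method.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def approx_covers(object_keywords: Iterable[str],
                  query_keyword: str,
                  max_edits: int = 1) -> bool:
    """Hypothetical helper: True if some object keyword lies within
    max_edits edits of the query keyword (assumed matching rule)."""
    return any(edit_distance(k, query_keyword) <= max_edits
               for k in object_keywords)


if __name__ == "__main__":
    # An object tagged "colour" approximately covers the query keyword "color".
    print(approx_covers(["colour", "cafe"], "color"))
```

Under this assumed rule, an exact-match CoSKQ would reject an object tagged "colour" for the query keyword "color", whereas the approximate variant accepts it because the two strings differ by a single insertion.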