Abstract

As a basic spatial data operation, the spatial keyword query searches spatial textual datasets to return information that meets user demands. Accurately understanding users' intentions and efficiently retrieving results from spatial textual big data remain central research problems. The complex correlations among textual features in spatial textual big data enrich the connotation of spatial objects, but they also make it difficult to recognize and retrieve similar spatial objects efficiently. Because many-to-many relationships between massive numbers of spatial objects and textual features are common, most existing approaches, which index spatial and textual data with tree-like and table-like structures, are inefficient at retrieving similar spatial objects. In this paper, we first define a spatial textual concept (STC) as a group of spatial objects that share the same textual keywords within a limited spatial region, in order to represent the many-to-many relationships between spatial objects and textual features. We then introduce the concept lattice model to maintain a group of related STCs and propose a hybrid tree-like spatial index structure, the lattice-tree, for spatial textual big data. The lattice-tree employs an R-tree to index the spatial locations of objects and embeds a concept lattice structure into specific tree nodes to organize the set of STCs derived from the large number of textual keywords of the objects and their relationships. On this basis, we also propose a novel spatial keyword query, the Top-k spatial concept query (TkSCQ), to answer STC queries and retrieve similar spatial objects with multiple textual features. The empirical study is carried out on two spatial textual big data sets, from Yelp and Amap. Experiments on the lattice-tree verify its feasibility and demonstrate that it is efficient to embed the concept lattice structure into tree nodes of 3 to 5 levels. Experiments on TkSCQ evaluate the lattice-tree with respect to results, keywords, data volume, and other factors. Two baseline index structures based on the IR-tree and the Fp-tree, named the inverted-tree and the Fpindex-tree, are developed for comparison with the lattice-tree on the Yelp and Amap data sets. The experimental results demonstrate that the lattice-tree has better retrieval efficiency in most cases; in particular, for queries over large amounts of data, its retrieval performance is much better than that of the inverted-tree and the Fpindex-tree.
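
The STC definition can be illustrated with a minimal Python sketch. It is not the lattice-tree itself: a plain bounding-box filter stands in for the R-tree, the concepts are enumerated by brute force rather than maintained in a lattice, and the Jaccard ranking is only an assumed similarity measure. The function names, the data layout, and the sample objects are all hypothetical.

```python
from itertools import combinations

def objects_in_region(objects, region):
    """Keep objects whose (x, y) lies inside region = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = region
    return {oid: frozenset(kws) for oid, (x, y, kws) in objects.items()
            if xmin <= x <= xmax and ymin <= y <= ymax}

def spatial_textual_concepts(objects, region):
    """Enumerate (extent, intent) pairs: groups of objects in the region
    that share the same keyword set. Intents are the object keyword sets
    closed under non-empty intersection (brute force, small inputs only)."""
    local = objects_in_region(objects, region)
    intents = set(local.values())
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(intents), 2):
            common = a & b
            if common and common not in intents:
                intents.add(common)
                changed = True
    return [(frozenset(o for o, kws in local.items() if intent <= kws), intent)
            for intent in intents]

def top_k_concepts(concepts, query, k):
    """Rank concepts by Jaccard similarity between intent and query keywords
    (an assumed measure); ties go to the concept with more objects."""
    query = frozenset(query)
    def score(concept):
        extent, intent = concept
        return (len(intent & query) / len(intent | query), len(extent))
    return sorted(concepts, key=score, reverse=True)[:k]

# Hypothetical sample data: id -> (x, y, keywords)
sample = {
    "a": (1, 1, {"coffee", "wifi"}),
    "b": (2, 2, {"coffee", "wifi", "bakery"}),
    "c": (3, 1, {"coffee", "bakery"}),
    "d": (50, 50, {"coffee", "wifi"}),  # outside the query region
}
concepts = spatial_textual_concepts(sample, (0, 0, 10, 10))
best = top_k_concepts(concepts, {"coffee", "wifi"}, k=2)
```

In this sketch each concept pairs a keyword set with every in-region object that carries all of those keywords, so one object can belong to several concepts and one keyword to several objects, which is the many-to-many relationship the abstract refers to.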
