Abstract

Continuous k nearest neighbor queries over spatial–textual data streams (abbreviated as CkQST) are the core operations of numerous location-based publish/subscribe systems. Such a system typically hosts millions of subscribed CkQST, all of which must be evaluated simultaneously whenever new objects arrive and old objects expire. To evaluate CkQST efficiently, we extend a quadtree with an ordered inverted index, which serves as the spatial–textual index that matches subscribed queries against incoming objects, and exploit it with three key techniques. (1) A memory-based cost model is proposed to find the optimal quadtree nodes covering the spatial search range of a CkQST, which minimizes the cost of searching and updating the index. (2) An adaptive block-based ordered inverted index is proposed to organize the keywords of CkQST; it adaptively arranges queries in spatial nodes and allows objects containing common keywords to be processed in a batch with a shared scan, which yields a significant performance gain. (3) A cost-based k-skyband technique is proposed to judiciously determine an optimal search range for each CkQST according to the workload of objects, reducing the re-evaluation cost caused by object expiration. Experiments on real-world and synthetic datasets demonstrate that the proposed techniques evaluate CkQST efficiently.
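To make the k-skyband idea in technique (3) concrete, the following is a minimal sketch of a plain k-skyband filter over candidate objects, not the cost-based variant proposed in the paper. It assumes a dominance rule commonly used for kNN monitoring over expiring data: one object dominates another if it is strictly closer to the query and expires no earlier. An object dominated by at least k others can never enter the kNN result during its lifetime, so it can be discarded. The function name `k_skyband` and the tuple layout `(oid, distance, expires_at)` are hypothetical and chosen only for illustration.

```python
def k_skyband(candidates, k):
    """Return the ids of candidates dominated by fewer than k others.

    candidates: list of (oid, distance, expires_at) tuples.
    Assumed dominance rule: o2 dominates o1 if o2 is strictly closer
    to the query and expires no earlier than o1.
    """
    kept = []
    for oid, dist, exp in candidates:
        # Count the objects that are closer and live at least as long.
        dominators = sum(
            1 for _, d2, e2 in candidates if d2 < dist and e2 >= exp
        )
        if dominators < k:
            kept.append(oid)
    return kept


# Hypothetical candidates for one query: (id, distance, expiry time).
candidates = [("a", 1.0, 5), ("b", 2.0, 9), ("c", 3.0, 7), ("d", 4.0, 6)]
band_1 = k_skyband(candidates, 1)
band_2 = k_skyband(candidates, 2)
```

With k = 1, only "a" and "b" survive: "c" and "d" are both farther than "b" and expire before it. The quadratic scan is for clarity only; it says nothing about how the paper maintains the skyband incrementally.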

Highlights

  • Continuous k nearest neighbor queries over spatial–textual data streams retrieve and continuously monitor at most k nearest neighbor objects to a user-specified location that contain all the user-specified keywords. Such queries are widely used in a variety of location-based applications, such as location-aware targeting of advertisements, analysis of micro-blogs, and mobile navigation services. In an e-coupon recommendation system or a Weibo publish/subscribe system, users register their interests as a query

  • An adaptive block-based ordered inverted index is proposed to organize the keywords of CkQST; it adaptively arranges queries in spatial nodes and allows objects containing common keywords to be processed in a batch with a shared scan, which yields a significant performance gain

  • Continuous queries over spatial–textual data streams have been studied by existing work [1,2,3,4,5,6,7,8,9,10,11,12]

Summary

Introduction

Continuous k nearest neighbor queries over spatial–textual data streams (abbreviated as CkQST) retrieve and continuously monitor at most k nearest neighbor (abbreviated as kNN) objects to a user-specified location that contain all the user-specified keywords. They are widely used in a variety of location-based applications, such as location-aware targeting of advertisements, analysis of micro-blogs, and mobile navigation services. In an e-coupon recommendation system or a Weibo publish/subscribe system, users register their interests (e.g., favorite food or clothing brand for the former, and news or persons for the latter) as a query. Continuous queries over spatial–textual data streams have been studied by existing work [1,2,3,4,5,6,7,8,9,10,11,12]. The number of qualified objects containing all keywords specified by a user can be far larger than k, because the objects (e.g., tweets, news) usually contain many more keywords than queries do. This motivates us to study CkQST, which return at most k nearest neighbor objects containing all the query keywords.
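The query semantics described above can be illustrated with a brute-force reference evaluator. This is only a sketch of what a CkQST returns at a given time, not the indexed method of the paper: it rescans every object on each call. The class names `StreamObject` and `Query`, the `expires_at` field, the use of Euclidean distance, and the function `evaluate` are all assumptions introduced for illustration.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamObject:
    oid: int
    x: float
    y: float
    keywords: frozenset
    expires_at: float  # assumed: the object is live while now < expires_at


@dataclass(frozen=True)
class Query:
    x: float
    y: float
    keywords: frozenset
    k: int


def evaluate(query, objects, now):
    """Return ids of at most k live objects nearest to the query location
    whose keyword sets contain all the query keywords."""
    candidates = (
        (math.hypot(o.x - query.x, o.y - query.y), o.oid)
        for o in objects
        if o.expires_at > now and query.keywords <= o.keywords
    )
    return [oid for _, oid in heapq.nsmallest(query.k, candidates)]


# Hypothetical e-coupon stream: objects carry more keywords than the query.
objects = [
    StreamObject(1, 0, 1, frozenset({"pizza", "coupon", "italian"}), 10),
    StreamObject(2, 0, 2, frozenset({"pizza"}), 10),
    StreamObject(3, 3, 4, frozenset({"pizza", "coupon"}), 5),
    StreamObject(4, 0, 6, frozenset({"coupon", "pizza", "sale"}), 20),
]
q = Query(0, 0, frozenset({"pizza", "coupon"}), 2)
result_t0 = evaluate(q, objects, now=0)
result_t5 = evaluate(q, objects, now=5)
```

At time 0 the result is objects 1 and 3; once object 3 expires at time 5, object 4 takes its place. That replacement on expiry is the re-evaluation step that a continuous query has to handle.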

