Abstract

Social media data, e.g., tweets, are usually geo-tagged, stamped with a creation or posting time, and associated with text. There is an increasing need to query such spatio-temporal-text data. In this work, we propose a new type of query, the top-k spatio-temporal keyword query (k-STKQ for short), over Twitter-like social media data. A k-STKQ takes a location, a timestamp, and a set of keywords as arguments, and returns the top-k objects that are near the location, close to the timestamp, and relevant to the set of keywords. An example k-STKQ searches for tweets mentioning “garage sale” that were recently sent from nearby places. The massive volume and dynamic nature of social media data are the primary obstacles to efficient processing of k-STKQs. To return answers efficiently, we propose a novel index, TiST, for processing k-STKQs. TiST partitions the incoming data into subsets and builds an R-tree index on each subset. The timestamps and texts of the objects are also integrated into the R-trees. To further boost indexing performance, we propose an R-tree insertion method based on a routing R-tree, inspired by the observation that many sets of objects have similar locations. For the texts of objects, we propose a hybrid bitmap-based index, which is space-saving and supports relevance computation. We also present a query processing algorithm based on the TiST index. We conduct extensive experiments to demonstrate that our solution provides excellent indexing performance and good query performance.
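To make the query semantics concrete, the following is a minimal brute-force sketch of a k-STKQ. It does not implement TiST, and the ranking function is an assumption: the abstract does not say how spatial proximity, temporal proximity, and keyword relevance are combined, so the sketch uses a weighted sum. The names `Post`, `score`, `k_stkq`, and the parameters `alpha`, `beta`, `dist_scale`, and `time_scale` are all hypothetical.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """A geo-tagged, timestamped text object (hypothetical structure)."""
    post_id: int
    x: float
    y: float
    timestamp: float  # seconds
    text: str


def score(post, qx, qy, qt, keywords, alpha=1 / 3, beta=1 / 3,
          dist_scale=1.0, time_scale=3600.0):
    """Assumed ranking: weighted sum of three components, each in [0, 1]."""
    # Spatial proximity decays with Euclidean distance from the query point.
    spatial = 1.0 / (1.0 + math.hypot(post.x - qx, post.y - qy) / dist_scale)
    # Temporal proximity decays with the gap to the query timestamp.
    temporal = 1.0 / (1.0 + abs(post.timestamp - qt) / time_scale)
    # Text relevance: fraction of query keywords that appear in the post.
    words = set(post.text.lower().split())
    textual = len(words & keywords) / len(keywords) if keywords else 0.0
    return alpha * spatial + beta * temporal + (1 - alpha - beta) * textual


def k_stkq(posts, qx, qy, qt, keywords, k):
    """Return the k highest-scoring posts by scanning every post."""
    kw = {w.lower() for w in keywords}
    return heapq.nlargest(k, posts, key=lambda p: score(p, qx, qy, qt, kw))
```

A call such as `k_stkq(posts, 0.0, 0.0, 1000.0, ["garage", "sale"], k=2)` scans every post, so its cost grows linearly with the data size. That linear scan is what an index such as TiST is meant to avoid.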
