Social networks such as Twitter generate thousands of terabytes of data per day, which can be mined for relevant information. This information is used to support marketing strategies, analyze current political issues, and track market trends, to name a few examples. One example is the discovery of cyclic behavior patterns (i.e., patterns that repeat frequently over time) in the population. Because trending topics on Twitter change rapidly, efficient algorithms are required, especially when the location and time of each broadcast are taken into account. This article presents an efficient algorithm, called HashCycle, that is based on association rules and finds cyclic patterns on Twitter while considering the inherent spatio-temporal attributes of the data. HashCycle uses a hash table to improve its efficiency. Notably, HashCycle does not use a minimum support threshold and can detect patterns in a single pass over a sequence. The processing times of HashCycle were compared with those of Apriori (a well-known algorithm widely used on diverse platforms) and the Projection-based Partial Periodic Patterns (PPA) algorithm (one of the most efficient algorithms in terms of processing time). Empirical results on two spatio-temporal databases (a synthetic data set and one based on Twitter) show that HashCycle achieves shorter processing times than both of these state-of-the-art algorithms.
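To make the idea of single-pass, hash-table-based cycle detection concrete, the following is a minimal illustrative sketch in Python. It is not the published HashCycle algorithm, whose details are not given in this abstract. The function name `find_cyclic_patterns`, the `(time_slot, location, topic)` event layout, and the rule that a cycle needs at least two equal gaps are all assumptions made for illustration.

```python
def find_cyclic_patterns(events):
    """Return {(location, topic): period} for keys that recur at a constant interval.

    `events` is an iterable of (time_slot, location, topic) tuples sorted by
    time_slot. This layout is a hypothetical one chosen for the sketch.
    """
    # Hash table keyed by (location, topic).
    # Each value is [last_time, period, gap_count, is_regular].
    table = {}

    # Single pass over the sequence: each event costs one hash lookup.
    for time_slot, location, topic in events:
        key = (location, topic)
        state = table.get(key)
        if state is None:
            table[key] = [time_slot, None, 0, True]
            continue

        gap = time_slot - state[0]
        if gap == 0:
            continue  # same key seen twice in one time slot: ignore the repeat

        if state[1] is None:
            state[1] = gap        # first observed gap becomes the candidate period
        elif gap != state[1]:
            state[3] = False      # gap differs from the candidate: not cyclic

        state[0] = time_slot
        state[2] += 1

    # Assumption of this sketch: at least two equal gaps (three occurrences)
    # are needed before a key is reported as cyclic.
    return {
        key: state[1]
        for key, state in table.items()
        if state[3] and state[2] >= 2
    }


sample_events = [
    (0, "Mexico City", "#traffic"),
    (1, "Madrid", "#football"),
    (2, "Mexico City", "#traffic"),
    (3, "Madrid", "#news"),
    (4, "Mexico City", "#traffic"),
    (5, "Madrid", "#football"),
    (6, "Mexico City", "#traffic"),
    (6, "Madrid", "#football"),
]

print(find_cyclic_patterns(sample_events))
```

In this sketch no support threshold is supplied by the caller, and the input sequence is read only once. The final step iterates over the hash table rather than over the data, so its cost depends on the number of distinct (location, topic) pairs and not on the length of the sequence.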