Abstract
In this paper, we address the scalability problem of periodicity detection for time series and sequence databases. We present a time- and space-efficient periodicity detection method that uses external memory (disk) when the series cannot be processed within the available main memory. Our approach uses a suffix tree to facilitate periodicity detection. We consider two cases: in-core and out-of-core. First, we optimize the storage requirements of the suffix tree so that larger suffix trees fit in-core. This makes it possible to mine larger sequences while everything is kept in-core, which is the only setting that current periodicity detection approaches handle. Second, when the data structures go out of core, we extend the suffix tree construction to use external memory while maintaining linear time complexity. As a result, we can mine databases whose data structures require considerably more space than the available main memory. For out-of-core periodicity detection, the proposed method allows the required data structures to be kept partially in external memory when a memory overflow occurs. We also propose various pruning strategies that allow the approach to process large sequences within a reasonable amount of time. Additionally, we introduce the notion of “emulated tree traversal” for fast suffix tree traversal. Owing to these considerations, we are able to mine much larger sequences than existing periodicity detection algorithms. To demonstrate the applicability, power, and effectiveness of the proposed framework, we present periodicity detection results on up to 500 MB of time sequence data, which, to the best of our knowledge, is the largest sequence reported to have been mined for periodicity.
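To make the task concrete, the following is a minimal in-memory sketch of what periodicity detection computes on a discretized series. It is not the suffix-tree method summarized above: it swaps in a naive substring index (quadratic in the worst case, entirely in-core) in place of the suffix tree, and the names `occurrence_positions`, `detect_periods`, `max_len`, and `min_confidence` are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def occurrence_positions(series, max_len):
    """Map every substring of length 1..max_len to its sorted start positions.

    Naive stand-in for the occurrence lists a suffix tree would provide.
    """
    positions = defaultdict(list)
    n = len(series)
    for start in range(n):
        for length in range(1, min(max_len, n - start) + 1):
            positions[series[start:start + length]].append(start)
    return positions


def detect_periods(series, max_len=3, min_confidence=0.8):
    """Return (pattern, period, start, confidence) tuples.

    For each pattern, candidate periods are the gaps between its first
    occurrence and each later occurrence. Confidence is the fraction of
    expected positions (start, start + period, ...) where the pattern
    actually occurs. Both thresholds are illustrative assumptions.
    """
    results = []
    n = len(series)
    for pattern, occ in occurrence_positions(series, max_len).items():
        occ_set = set(occ)
        first = occ[0]
        for later in occ[1:]:
            period = later - first
            if period < len(pattern):
                continue  # skip self-overlapping repetitions
            # Number of positions where the pattern could still fit.
            expected = (n - first - len(pattern)) // period + 1
            actual = sum(
                1 for k in range(expected) if first + k * period in occ_set
            )
            confidence = actual / expected
            if expected >= 2 and confidence >= min_confidence:
                results.append((pattern, period, first, confidence))
    return results


if __name__ == "__main__":
    for hit in detect_periods("abcabcabcabc"):
        print(hit)
```

On the series `abcabcabcabc`, the sketch reports the symbol `a` and the pattern `abc` as periodic with period 3 starting at position 0, along with multiples of that period. Reporting those multiples is redundant, and pruning strategies of the kind the abstract mentions would presumably remove such output; this sketch makes no attempt to do so.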