Abstract

In diverse areas of human endeavour such as business, industry and the sciences, massive amounts of time series data are generated daily. Because these datasets are typically very large, discovering information from them is a major challenge. A number of algorithms have been introduced to represent, classify, cluster, segment and index time series data, and to detect motifs and anomalies in them. This paper proposes a robust algorithm for pattern recognition and representation of a time series. The algorithm first normalises a time series dataset into the range [0,1]. The normalised version is then used for pattern identification and representation. In the proposed algorithm, we pre-define three patterns (up, down and flat), all of equal length (three, five or ten data points). Each pattern represents a segment (subsequence) of the time series. The algorithm was tested with historical time series datasets obtained online via Yahoo Finance for (a) the Dow Jones Industrial Average, (b) the Nasdaq and (c) the S&P 500. Each dataset consisted of 5158 data points, covering the period 2000-2020. The algorithm captured all the pre-defined patterns in the datasets and was able to represent the patterns in the entire historical datasets with symbols. The algorithm is therefore a useful tool for time series data mining operations. Object-Oriented Analysis and Design Methodology (OOADM) and a prototyping methodology were used to design the system, while PHP, MySQL, HTML and CSS were used to develop it. The system was tested and performed well.
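
The normalise-then-symbolise idea described above can be illustrated with a minimal sketch. This is not the authors' PHP implementation, and the abstract does not state how a segment is judged to be up, down or flat. The sketch therefore makes several assumptions: min-max normalisation, non-overlapping windows, a comparison of the first and last point of each window, a hypothetical `tolerance` threshold for "flat", and the symbols `U`, `D` and `F`. The function names are also illustrative.

```python
def min_max_normalise(series):
    """Rescale a sequence of numbers into the range [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series has no spread; map every point to 0.0.
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def symbolise(series, window=3, tolerance=0.01):
    """Represent a series as a string of U (up), D (down), F (flat) symbols.

    Assumptions (not taken from the paper): windows do not overlap,
    a trailing partial window is ignored, and a window's pattern is
    decided by the change between its first and last normalised value.
    """
    norm = min_max_normalise(series)
    symbols = []
    for start in range(0, len(norm) - window + 1, window):
        segment = norm[start:start + window]
        change = segment[-1] - segment[0]
        if change > tolerance:
            symbols.append("U")
        elif change < -tolerance:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


if __name__ == "__main__":
    # Toy closing prices, not real index data.
    prices = [10, 11, 12, 12, 12, 12, 12, 11, 10]
    print(symbolise(prices, window=3))
```

With a window of three points, the toy series splits into a rising, a constant and a falling segment. Setting `window` to 5 or 10 corresponds to the other pattern lengths mentioned in the abstract.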
