Abstract

This article designs and implements a big data analysis and processing system for large-scale time series data, built on the Spark distributed platform. The system framework is divided into three layers: a storage layer, an operator layer, and an algorithm layer. At the storage layer, the system organizes and indexes large-scale time series data using HDFS and Hive. At the operator layer, the system provides the basic operations commonly applied to time series data on the Spark platform, and users can call these operators directly to implement custom time series processing algorithms. At the algorithm layer, the system implements several commonly used time series analysis algorithms on the Spark platform, including similarity query, clustering, and forecasting, which users can apply directly to time series analysis. Performance and functional tests verify the feasibility and practicality of the system.

