Abstract

This article designs and implements a big data analysis and processing system that runs on the Spark distributed platform and processes large-scale time series data. The system framework is divided into three layers: a storage layer, an operator layer, and an algorithm layer. At the storage layer, the system organizes and indexes large-scale time series data using HDFS and Hive. At the operator layer, the system provides the basic operations commonly applied to time series data on the Spark platform, and users can combine these operators directly to implement their own time series processing algorithms. At the algorithm layer, the system implements several commonly used time series analysis algorithms on the Spark platform, including similarity query, clustering, and forecasting, which users can apply directly to time series analysis. The feasibility and practicability of the system are verified by testing its performance and functionality.
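The relationship between the operator layer and the algorithm layer can be illustrated with a minimal sketch. It is not the system's actual API: the function names (`sliding_windows`, `z_normalize`, `euclidean`, `best_match`) are hypothetical, the choice of z-normalized Euclidean distance is an assumption, and the code runs on a single machine in plain Python rather than on Spark. It only shows how a similarity query might be composed from a few basic time series operators.

```python
import math
from typing import List, Sequence, Tuple

# --- Operator layer (hypothetical names): basic time series operations ---

def sliding_windows(series: Sequence[float], width: int) -> List[List[float]]:
    """Return every contiguous window of the given width."""
    if width <= 0:
        raise ValueError("width must be positive")
    return [list(series[i:i + width]) for i in range(len(series) - width + 1)]

def z_normalize(window: Sequence[float]) -> List[float]:
    """Rescale a window to zero mean and unit variance.

    A constant window has zero variance, so it is mapped to all zeros.
    """
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]

def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# --- Algorithm layer: a similarity query composed from the operators ---

def best_match(series: Sequence[float], query: Sequence[float]) -> Tuple[int, float]:
    """Return (start offset, distance) of the window most similar to query."""
    candidates = sliding_windows(series, len(query))
    if not candidates:
        raise ValueError("series is shorter than query")
    q = z_normalize(query)
    distances = [euclidean(z_normalize(w), q) for w in candidates]
    # On ties, the earliest offset wins.
    offset = min(range(len(distances)), key=distances.__getitem__)
    return offset, distances[offset]
```

For example, `best_match([5, 5, 5, 1, 2, 3, 5, 5], [10, 20, 30])` locates the rising segment `[1, 2, 3]` at offset 3, because normalization removes the difference in scale. In a distributed setting the per-window work would be spread across partitions, but that part is outside this sketch.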

