Abstract

With the rapid development of wireless sensor networks and network traffic monitoring, data streams are becoming one of the most common forms of generated data. A data stream differs from traditional static data. Cluster analysis is an important data mining technique, so many researchers have turned their attention to clustering streaming data. The literature offers many data stream clustering techniques; unfortunately, very few of them address the problem of clustering data streams that come from multiple sources. In this article, we present a tree-structured algorithm for grouping data streams (in the form of time series) that have similar properties and behaviors. We evaluated the algorithm on real multivariate data streams generated by smart meter sensors: the Irish Commission for Energy Regulation data set. Several measures were used to analyze the characteristics of the tree-like clustering structure (the computer science perspective), along with measures that are important from a business standpoint. The proposed method was able to cluster the data streams and identified customers with similar behavior during the analyzed period.

Highlights

  • The term data stream refers to a potentially unwieldy, continuous, and rapid sequence of information

  • We have proposed an approach to grouping multiple data stream time series in order to:

  • For the parameter set to 5% (50 time series) and window length w set to 30 days (1440 records)

Summary

Introduction

The term data stream refers to a potentially unwieldy, continuous, and rapid sequence of information. Unlike traditional forms of data, which are fixed and static, a data stream has its own distinctive features: (1) it is a continuous flow of very large volumes of data; (2) it evolves rapidly, arrives in real time, and requires quick responses; (3) reading the stream more than once is almost impossible; and (4) storage for the stream is restricted. Data streams occur in many real-world scenarios. They are generated by sensors, network traffic, satellites, and other sources. They must be processed quickly, extracting as much knowledge as possible. Data streams therefore have their own specific requirements for data processing and exploration.
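
Features (3) and (4) above can be illustrated with a small Python sketch: each reading is seen exactly once, and only the most recent w readings are kept in memory. This is not the clustering algorithm proposed in the article; the names SlidingWindowMean and windowed_means, and the choice of a running mean as the summary statistic, are assumptions made purely for illustration.

```python
from collections import deque


class SlidingWindowMean:
    """Hypothetical helper: keeps the last `w` readings of one stream
    and their running sum, so memory use is bounded by `w`."""

    def __init__(self, w):
        if w <= 0:
            raise ValueError("window length w must be positive")
        self.window = deque(maxlen=w)  # oldest reading is dropped automatically
        self.total = 0.0

    def update(self, x):
        # When the window is full, the oldest reading is about to be evicted,
        # so subtract it from the running sum first.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(x)
        self.total += x
        return self.total / len(self.window)


def windowed_means(stream, w):
    """Consume `stream` in a single pass, yielding the mean of the
    last `w` readings after each new arrival."""
    tracker = SlidingWindowMean(w)
    for reading in stream:
        yield tracker.update(reading)
```

For example, list(windowed_means(iter([1, 2, 3, 4, 5]), 3)) consumes the iterator once and yields 1.0, 1.5, 2.0, 3.0, 4.0. The stream is never rewound, and at most three readings are held at any time.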
