EvolveCluster: an evolutionary clustering algorithm for streaming data

Christian Nordahl, Marie Persson Netz, Veselka Boeva, Håkan Grahn

doi:10.1007/s12530-021-09408-y

Open Access

https://doi.org/10.1007/s12530-021-09408-y

Journal: Evolving Systems	Publication Date: Nov 13, 2021
Citations: 7	License type: open-access

Affiliation: Blekinge Institute of Technology

Abstract

Data has become an integral part of our society in recent years, arriving faster and in larger quantities than before. Traditional clustering algorithms rely on the availability of entire datasets to model them correctly and efficiently. Such requirements cannot be met in the data stream clustering scenario, where data arrives and needs to be analyzed continuously. This paper proposes a novel evolutionary clustering algorithm, entitled EvolveCluster, capable of modeling evolving data streams. We compare EvolveCluster against two other evolutionary clustering algorithms, PivotBiCluster and Split-Merge Evolutionary Clustering, by conducting experiments on three different datasets. We also perform additional experiments on EvolveCluster to further evaluate its ability to cluster evolving data streams. Our results show that EvolveCluster captures evolving data stream behaviors and adapts accordingly.

Highlights

In recent years, data has become an integral part of our daily lives.
The EvolveCluster and Split-Merge Clustering algorithms are more proficient than PivotBiCluster at identifying new clusters when they arrive in the data stream (see https://github.com/christiannordahl/EvolveCluster).
It is interesting to note that Split-Merge Clustering fully follows the true clustering of the data points, to the point that even data points that overlap with another cluster are correctly classified.

Summary

Introduction

Data has become an integral part of our daily lives. Advances in hardware infrastructure make it possible to collect almost any type of data at a rapid pace (Bifet et al. 2010b). These data streams are unbounded sources of information that arrive continuously over time. The results of unsupervised learning algorithms can be used directly for analysis or as an intermediate step toward understanding the data. One branch of unsupervised learning is cluster analysis. Clustering algorithms are designed to identify the underlying structure of the data and to use the relationships detected within that structure to group the data points into distinct groups. These algorithms usually decide by themselves how to divide the data into subgroups, which makes them an unsupervised way to increase knowledge about the data. This study focuses on partitioning algorithms because of the characteristics of the proposed evolutionary clustering algorithm (see Sect. 4).
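
To make the notion of a partitioning algorithm concrete, the following is a minimal sketch of a generic centroid-based partitioner (Lloyd's k-means iteration). It is not EvolveCluster and is not taken from the paper; the function name `kmeans`, its parameters, and the toy points are illustrative assumptions only.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, centroids, iterations=10):
    """Generic partitioning sketch (Lloyd's algorithm), NOT EvolveCluster.

    points    -- list of equal-length numeric tuples
    centroids -- initial centroids (hypothetical choice made by the caller)
    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # empty cluster: leave centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Toy data: two well-separated groups, seeded with one point from each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(points, [points[0], points[2]])
print(labels)
print(centroids)
```

The sketch needs the whole list of points up front, which is exactly the assumption that does not hold when data arrives as a stream.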

