A Comparative Study on Data Stream Clustering Algorithms

Twinkle Keshvani, Madhu Shukla

doi:10.1007/978-3-030-24643-3_27

Abstract

Data stream mining has attracted growing attention because many applications now generate enormous amounts of data. Most non-stationary systems produce data that are massive in volume, non-static, and fast changing. Because the available data are so large, clustering them is essential for improving data quality. Clustering in turn speeds up data processing, which is of great importance in data stream mining. Because data streams are enormous, complex, fast changing, and potentially infinite, they impose additional challenges on clustering techniques, namely time limitations and memory constraints. Many clustering algorithms have emerged for mining data streams. It is also necessary to identify clusters of arbitrary shape, and density-based clustering algorithms play an important role here. These algorithms form a notable class that can find clusters of arbitrary shape and detect noise. Moreover, they do not require the number of clusters to be known in advance. Their primary focus is to use density-based methods to form clusters while coping with the constraints inherent in data streams. The objective of this paper is to discuss density-based algorithms and their pros and cons.

Full Text