A fuzzy incremental clustering approach to hybrid data discovery

Radu D Găceanu, Horia F Pop

doi:10.2478/v10198-012-0010-x

Abstract

We propose an incremental fuzzy clustering algorithm for hybrid data discovery. The algorithm is based on the ASM model, in which data items are represented by agents placed on a two-dimensional grid. The agents group themselves into clusters by making simple moves in their environment: they try to get closer to each other if they are rather similar, and to move away from each other if they are rather different. The algorithm allocates a new agent on the grid whenever a new data item arrives. At each step the new agent contacts an agent from the grid, and if the two are similar they group together in the same cluster. Whenever a new cluster is created, the agents try to merge it with one of the previously created clusters. If a newly created agent does not find a similar fellow, it starts an ASM-like process to search for one, and thus the data is clustered.

Several clustering algorithms exist, each with its own strengths and weaknesses. Some algorithms need an initial estimate of the number of clusters (k-means, fuzzy c-means); others are often too slow (agglomerative hierarchical clustering algorithms). Ant-based clustering algorithms often require hybridization with a classical clustering algorithm such as k-means.

In [2] an ant-based clustering algorithm is presented. It is based on the ASM (Ants Sleeping Model) approach. In ASM, an ant has two states on a two-dimensional grid: active and sleeping. When an artificial ant's fitness is low, it has a higher probability of waking up and staying in the active state. It then leaves its original position to search for a more secure and comfortable position in which to sleep. When an ant locates a comfortable and secure position, it has a higher probability of sleeping, unless the surrounding environment becomes less hospitable and activates it again.
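The sleep/wake behaviour described above can be sketched in a few lines. This is only an illustrative sketch: the text does not give the fitness or probability formulas, so the functional forms below, the parameter names `alpha`, `beta` and `lam`, and the use of Euclidean distance are assumptions chosen so that low fitness yields a high wake-up probability.

```python
import math
import random


def fitness(agent, neighbours, alpha=1.0):
    """Average similarity of `agent` to the agents in its neighbourhood.

    ASSUMPTION: similarity is alpha^2 / (alpha^2 + d^2) with Euclidean d,
    averaged over the neighbours that are present. The source does not
    state the formula.
    """
    if not neighbours:
        return 0.0  # an isolated agent has no support for its position
    total = 0.0
    for other in neighbours:
        d = math.dist(agent, other)
        total += alpha ** 2 / (alpha ** 2 + d ** 2)
    return total / len(neighbours)


def activation_probability(f, beta=0.1, lam=2.0):
    """Probability that an agent with fitness `f` is active.

    ASSUMPTION: a decreasing function of fitness, so that poorly placed
    agents tend to wake up and well placed agents tend to sleep.
    """
    return beta ** lam / (beta ** lam + f ** lam)


def next_state(f, rng):
    """Draw the agent's next state from its activation probability."""
    return "active" if rng.random() < activation_probability(f) else "sleeping"


if __name__ == "__main__":
    rng = random.Random(0)
    well_placed = fitness((0.0, 0.0), [(0.1, 0.0), (0.0, 0.1)])
    badly_placed = fitness((0.0, 0.0), [(4.0, 3.0), (6.0, 8.0)])
    print(well_placed, activation_probability(well_placed))
    print(badly_placed, activation_probability(badly_placed))
    print(next_state(badly_placed, rng))
```

An agent surrounded by near-identical neighbours gets a fitness close to 1 and a small activation probability, so it tends to stay asleep; an agent among dissimilar neighbours gets a fitness close to 0 and is likely to wake up and move.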
In [3] a Stigmergic Agent System (SAS) combining the strengths of Ant Colony Systems and Multi-Agent Systems concepts is proposed. The agents in the SAS use both direct and indirect communication. Direct communication lowers the risk of getting trapped in local optima. However, as shown in [16], most ant-based algorithms can be used only in a first phase of the clustering process because of the high number of clusters that they usually produce. In a second phase a k-means-like algorithm is often used.

In [16], an algorithm in which the behaviour of the artificial ants is governed by fuzzy IF-THEN rules is presented. As with all ant-based clustering algorithms, no initial partitioning of the data is needed, nor does the number of clusters need to be known in advance. The ants are capable of making their own decisions about picking up items. Hence the two phases of the classical ant-based clustering algorithm are merged into one, and k-means becomes superfluous.

Highlights

Several clustering algorithms exist, each with its own strengths and weaknesses
Incremental clustering is used to process sequential, continuous data flows or data streams, and in situations in which cluster shapes change over time
Incremental algorithms are well suited to real-time systems, wireless sensor networks or data streams, because in such systems it is difficult to store the datasets in memory

Summary

INTRODUCTION

Several clustering algorithms exist, each with its own strengths and weaknesses. Some algorithms need an initial estimate of the number of clusters (k-means, fuzzy c-means); others are often too slow (agglomerative hierarchical clustering algorithms). In [2] an ant-based clustering algorithm is presented. It is based on the ASM (Ants Sleeping Model) approach. The agents are able to detect changes in the environment and adjust their moves. The advantage of this approach is that it enables the ants to communicate directly, as in [3], breaking the neighbourhood boundaries and decreasing the chance of ants getting trapped in local minima. In order to solve the clustering problem we propose an incremental algorithm based on ASM (Ants Sleeping Model) [2, 6]. Incremental clustering is used to process sequential, continuous data flows or data streams, and in situations in which cluster shapes change over time. Such algorithms are well suited to real-time systems, wireless sensor networks or data streams, because in such systems it is difficult to store the datasets in memory. The advantages and drawbacks of the approach, together with some concluding remarks, are presented in the closing Section 7.
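To make the incremental idea concrete, the following is a minimal sketch of one-pass clustering over a stream, in which each arriving item either joins a sufficiently similar existing cluster or starts a new one. It is not the authors' algorithm: the class name `IncrementalClusterer`, the centroid-based comparison, the similarity function and the `threshold` parameter are all assumptions, memberships are crisp rather than fuzzy, and the cluster-merging step and the ASM-like search on the grid are omitted.

```python
import math


def similarity(a, b, alpha=1.0):
    """ASSUMPTION: similarity in (0, 1] derived from Euclidean distance."""
    d = math.dist(a, b)
    return alpha ** 2 / (alpha ** 2 + d ** 2)


class IncrementalClusterer:
    """Hypothetical one-pass clusterer; items are numeric tuples."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # minimum similarity needed to join a cluster
        self.clusters = []          # each cluster is a list of items

    @staticmethod
    def _centroid(cluster):
        n = len(cluster)
        dims = len(cluster[0])
        return tuple(sum(item[i] for item in cluster) / n for i in range(dims))

    def add(self, item):
        """Place one newly arrived item and return the index of its cluster."""
        best_index, best_sim = None, 0.0
        for index, cluster in enumerate(self.clusters):
            sim = similarity(item, self._centroid(cluster))
            if sim > best_sim:
                best_index, best_sim = index, sim
        if best_index is not None and best_sim >= self.threshold:
            # Similar enough: group with the existing cluster.
            self.clusters[best_index].append(item)
            return best_index
        # No similar cluster found: the item starts a cluster of its own.
        self.clusters.append([item])
        return len(self.clusters) - 1


if __name__ == "__main__":
    clusterer = IncrementalClusterer(threshold=0.5)
    for point in [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]:
        print(point, "->", clusterer.add(point))
```

Only the current clusters are kept between arrivals, and each new item is processed once, which is the property that makes incremental schemes attractive when the full dataset cannot be held in memory.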

MOTIVATION

RELATED WORK

THEORETICAL BACKGROUND

INCREMENTAL FUZZY CLUSTERING

Formal aspects

Our approach

EXPERIMENTS

Findings

CONCLUSION


Journal: Acta Electrotechnica et Informatica	Publication Date: Jan 1, 2012
Citations: 5	License type: cc-by


