Abstract

Classical clustering techniques such as k-means are very sensitive to the initial choice of cluster centers, so they must be rerun many times to obtain an optimal result. A relatively new clustering approach, Affinity Propagation (AP), was devised to resolve this problem. Although AP is very powerful, it still has several issues that need improvement. This paper discusses four approaches that extend AP: Adaptive Affinity Propagation, Partition Affinity Propagation, Soft Constraint Affinity Propagation, and Fuzzy Statistic Affinity Propagation. These approaches are implemented and compared to identify which issues of AP they actually address and which still need improvement. According to the test results, Partition Affinity Propagation is the fastest of the four approaches. Adaptive Affinity Propagation, on the other hand, is much more tolerant of errors: it can remove oscillation when it occurs, whereas oscillation otherwise causes the algorithm to fail to converge. Adaptive Affinity Propagation is therefore more stable than the others, since it can deal with errors that the others cannot. Fuzzy Statistic Affinity Propagation produces fewer clusters than the others, since it generates its own preferences using a fuzzy iterative method.

Highlights

  • Nowadays, the need for information in many aspects of life is very high

  • Partition Affinity Propagation first passes messages within subsets of the data and then merges the results as the initial step of the iterations; this can effectively reduce the number of clustering iterations

  • Fuzzy Statistic Affinity Propagation is based on fuzzy statistics and Affinity Propagation (AP). It simultaneously considers all data points in the feature space as initial clustering exemplars and iteratively refines them with the mean distance deviation until it obtains the optimal fuzzy statistical similarity matrix
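
The approaches named above all build on the basic AP message passing. As a point of reference, the following is a minimal sketch of plain AP only, not of any of the four variants and not the paper's own implementation. It assumes negative squared Euclidean distance as the similarity, the median off-diagonal similarity as the shared preference, and a fixed damping factor of 0.5; the function names, parameter names, and toy data are all illustrative choices. Damping is the standard guard against the oscillation mentioned in the abstract.

```python
import numpy as np


def similarity_matrix(X):
    """Negative squared Euclidean distances; diagonal = median preference (assumption)."""
    diff = X[:, None, :] - X[None, :, :]
    S = -(diff ** 2).sum(axis=2)
    off_diag = S[~np.eye(len(X), dtype=bool)]
    np.fill_diagonal(S, np.median(off_diag))
    return S


def affinity_propagation(S, damping=0.5, max_iter=200, stable_iters=15):
    """Plain AP on a similarity matrix S whose diagonal holds the preferences."""
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    rows = np.arange(n)
    last, stable = None, 0
    exemplars = np.array([], dtype=int)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new

        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new

        # A point is an exemplar when a(k,k) + r(k,k) > 0.
        exemplars = np.flatnonzero(np.diag(A + R) > 0)
        key = tuple(exemplars)
        stable = stable + 1 if key == last else 0
        last = key
        if stable >= stable_iters and len(exemplars) > 0:
            break

    if len(exemplars) == 0:
        return exemplars, np.full(n, -1)
    # Assign every point to its most similar exemplar.
    labels = S[:, exemplars].argmax(axis=1)
    labels[exemplars] = np.arange(len(exemplars))
    return exemplars, labels


# Toy data (hypothetical): two well-separated groups of five points each.
X = np.array([
    [0.0, 0.0], [1.0, 0.2], [0.1, 1.1], [-0.9, 0.1], [0.2, -1.0],
    [10.0, 10.0], [11.1, 10.1], [9.9, 11.0], [9.0, 10.2], [10.1, 8.9],
])
exemplars, labels = affinity_propagation(similarity_matrix(X))
print("exemplar indices:", exemplars)
print("labels:", labels)
```

In this sketch no cluster count and no initial centers are supplied; the number of clusters follows from the preference values on the diagonal of the similarity matrix, which is the quantity the Fuzzy Statistic variant is described as generating for itself.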


Summary

Introduction

The need for information in many aspects of life is very high, and information needs to be delivered quickly and accurately. This makes the process of extracting information from data crucial. There are many ways to mine information from data; one of them is clustering. Clustering is commonly used to analyze very large data sets whose class labels are unknown. Because assigning class labels to a large amount of data is a very costly process, an approach such as clustering is needed to mine useful information from the data. Clustering research focuses on finding methods for efficient and effective cluster analysis in large databases [4].
