Abstract

Cluster analysis is the most popular, and often the foremost, task in big data analytics, as it helps unearth hidden patterns and trends in data. Traditional single-objective clustering techniques often suffer from accuracy fluctuations, especially when applied to data groups of varying densities and imbalanced distributions, or in the presence of outliers. This paper presents a multi-phase clustering solution that achieves good accuracy even on noisy and not-well-separated (not linearly separable) data. The proposed design combines two-stage Particle Swarm Optimisation (PSO) clustering with K-means logic and a state-of-the-art outlier removal technique. Using a different optimisation criterion in each of the two PSO stages equips the model to escape local minima during convergence. Extensive experiments on a wide variety of data were carried out; the system achieved accuracy as high as 99.9%, with an average of 87.4% on not-well-separated data. The model also proved robust on eight of the ten datasets of the Fundamental Clustering Problems Suite (FCPS), a benchmark for clustering algorithms.
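The abstract does not state which two optimisation criteria the PSO stages use, nor which outlier removal technique is applied, so the following is only a minimal sketch of the general idea under loudly stated assumptions: stage one minimises quantization error, stage two minimises the sum of squared errors (SSE), stage two is seeded with the personal bests of stage one, and a few K-means iterations refine the result. Outlier removal is omitted entirely. All function names and parameter values here are hypothetical and are not taken from the paper.

```python
import numpy as np

def assign(X, centroids):
    # Distance from every point to every centroid: shape (n_points, k).
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1), d.min(axis=1)

def sse(X, centroids):
    # Assumed stage-two criterion: sum of squared distances to nearest centroid.
    _, dmin = assign(X, centroids)
    return float(np.sum(dmin ** 2))

def quantization_error(X, centroids):
    # Assumed stage-one criterion: mean over non-empty clusters of the
    # mean point-to-centroid distance.
    labels, dmin = assign(X, centroids)
    per_cluster = [dmin[labels == j].mean()
                   for j in range(len(centroids)) if np.any(labels == j)]
    return float(np.mean(per_cluster))

def pso_stage(X, fitness, swarm, rng, iters=50, w=0.72, c1=1.49, c2=1.49):
    # Each particle is a full set of k centroids: swarm has shape (n, k, dim).
    vel = np.zeros_like(swarm)
    pbest = swarm.copy()
    pbest_fit = np.array([fitness(X, p) for p in swarm])
    g = pbest_fit.argmin()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(iters):
        r1, r2 = rng.random(swarm.shape), rng.random(swarm.shape)
        vel = w * vel + c1 * r1 * (pbest - swarm) + c2 * r2 * (gbest - swarm)
        swarm = swarm + vel
        fit = np.array([fitness(X, p) for p in swarm])
        better = fit < pbest_fit
        pbest[better] = swarm[better]
        pbest_fit[better] = fit[better]
        g = pbest_fit.argmin()
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return pbest, gbest

def kmeans_refine(X, centroids, iters=20):
    # Plain Lloyd iterations; empty clusters keep their previous centroid.
    centroids = centroids.copy()
    for _ in range(iters):
        labels, _ = assign(X, centroids)
        for j in range(len(centroids)):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def two_stage_pso_kmeans(X, k, n_particles=15, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise every particle with k distinct data points as centroids.
    idx = np.array([rng.choice(len(X), size=k, replace=False)
                    for _ in range(n_particles)])
    swarm = X[idx]
    # Stage 1 and stage 2 use different criteria (an assumption, see above).
    swarm, _ = pso_stage(X, quantization_error, swarm, rng)
    _, gbest = pso_stage(X, sse, swarm, rng)
    centroids = kmeans_refine(X, gbest)
    labels, _ = assign(X, centroids)
    return centroids, labels
```

Calling `two_stage_pso_kmeans(X, k=2)` on an `(n_points, dim)` array returns the refined centroids and one cluster label per point. Switching the criterion between stages changes the fitness landscape the swarm sees, which is one plausible reading of how a second criterion can help a swarm leave a local minimum of the first.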
