Abstract

Clustering is one of the most widely used pattern recognition technologies for data analytics. Density-based clustering is a category of clustering methods which can find arbitrary shaped clusters. A well-known density-based clustering algorithm is Density- Based Spatial Clustering of Applications with Noise (DBSCAN). DBSCAN has three drawbacks: firstly, the parameters for DBSCAN are hard to set; secondly, the number of clusters cannot be controlled by the users; and thirdly, DBSCAN cannot directly be used as a classifier. With addressing the drawbacks of DBSCAN, a novel framework, Evolutionary and Swarm Algorithm optimised Density-based Clustering and Classification (ESA-DCC), is proposed. Evolutionary and Swarm Algorithm (ESA), has been applied in various different research fields regarding optimisation problems, including data analytics. Numerous categories of ESAs have been proposed, such as, Genetic Algorithms (GAs), Particle Swarm Optimization (PSO), Differential Evaluation (DE) and Artificial Bee Colony (ABC). In this thesis, ESA is used to search the best parameters of density-based clustering and classification in the ESA-DCC framework to address the first drawback of DBSCAN. As method to offset the second drawback, four types of fitness functions are defined to enable users to set the number of clusters as input. A supervised fitness function is defined to use the ESA-DCC as a classifier to address the third drawback. Four ESA- DCC methods, GA-DCC, PSO-DCC, DE-DCC and ABC-DCC, are developed. The performance of the ESA-DCC methods is compared with K-means and DBSCAN using ten datasets. The experimental results indicate that the proposed ESA-DCC methods can find the optimised parameters in both supervised and unsupervised contexts. The proposed methods are applied in a product recommender system and image segmentation cases.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call