Abstract

Ant colony optimization (ACO) is a robust algorithm for solving optimization problems, including clustering. However, selecting the proper cluster for each object requires heavy and redundant computation, especially when the data are high-dimensional, as with social media comments. Reducing the redundant computation may cut the execution time, but it can also decrease the clustering quality. Based on the idea that nearby objects tend to belong to the same cluster, the nearest neighbors method can be used to choose an appropriate cluster for some objects efficiently by considering their neighbors' clusters. Therefore, this paper proposes a combination of nearest neighbors and ant colony optimization for clustering (NNACOC), which reduces the computation time while retaining the clustering quality. To evaluate its performance, NNACOC was tested on several benchmark datasets and Twitter comments. In most of the experiments, NNACOC outperformed the original ant colony optimization for clustering (ACOC) in both quality and execution time. NNACOC also yielded better results than k-means when clustering the Twitter comments.

Highlights

  • Clustering plays an important role in many applications, such as business intelligence and analytics [1], public health and security [2], and energy saving in the Internet of Things [3], [4]

  • This paper proposes nearest neighbors and ant colony optimization for clustering (NNACOC), a hybrid of the nearest neighbors method and the ant colony optimization for clustering (ACOC) algorithm that is more efficient than ACOC while retaining the clustering quality

  • Based on the evaluation of the NNACOC algorithm, it can be concluded that the nearest neighbors algorithm can improve ant colony optimization (ACO) based clustering algorithms, especially when clustering large datasets


Summary

Introduction

Clustering plays an important role in many applications, such as business intelligence and analytics [1], public health and security [2], and energy saving in the Internet of Things [3], [4]. It has also been applied in many text mining tasks. According to [5], the rapid growth of social media usage has generated petabytes of data, most of it in the form of text: blogs, Twitter comments, Facebook feeds, chats, e-mails, and reviews. Clustering social media comments has drawn considerable interest from governments and businesses as a way to read people's opinions quickly and accurately. According to [8], most clustering methods can be regarded as optimization problems that seek the best partitioning of the data with respect to an objective function. One of the most popular clustering algorithms is k-means, which was introduced in 1955 and is still widely used [8]. Some researchers have proposed metaheuristic approaches for solving clustering problems, such as artificial bee colony (ABC) [11], [12] and ant colony optimization (ACO) [13]–[21].
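To make the view of clustering as an optimization problem concrete, the following minimal sketch scores two candidate partitions of the same toy data with an objective function. The choice of objective (the sum of squared Euclidean distances to cluster centroids, which is the quantity k-means minimizes), the function name `sse`, and the toy points are assumptions made for illustration; the text above does not state which objective the paper uses.

```python
from collections import defaultdict

def sse(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid.

    Illustrative objective only (assumed here, not taken from the paper).
    """
    # Group the points by their assigned cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # The centroid is the coordinate-wise mean of the cluster members.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        for m in members:
            total += sum((m[d] - centroid[d]) ** 2 for d in range(dim))
    return total

# Hypothetical toy data: two visually obvious groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
compact_partition = [0, 0, 1, 1]  # groups nearby points together
mixed_partition = [0, 1, 0, 1]    # splits each natural group

print(sse(points, compact_partition))
print(sse(points, mixed_partition))
```

Under this assumed objective, the partition that groups nearby points together receives the lower score, so an optimizer searching over partitions, whether k-means or a metaheuristic, would prefer it.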
