Abstract

Developing high-performance data analytics systems is a critical research topic in the era of big data and has received significant attention from different disciplines since the 2000s. Many recent works have attempted to build such systems to handle large amounts of data (volume) from different information systems (variety) that are typically created within a short time (velocity). In particular, several recent studies have shown that metaheuristic algorithms can be applied to many data mining optimization problems and can find higher-quality results than traditional deterministic algorithms. This paper presents a high-performance clustering algorithm for big data analytics systems. The proposed algorithm is based on a new kind of metaheuristic algorithm, coral reef optimization with substrate layers (CRO-SL), to obtain better clustering results. To improve effectiveness and efficiency, the proposed CRO-SL scheme has also been applied to a cloud computing platform to reduce the response time of the data analytics system. Simulation results show that, in terms of the sum of squared errors, the proposed algorithm provides better clustering results than the other clustering algorithms compared in this research, including k-means, the genetic k-means algorithm, particle swarm optimization, and the simple coral reef optimization algorithm.
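
The comparison metric named above, the sum of squared errors (SSE), can be illustrated with a minimal sketch. It assumes the common definition in which each point contributes its squared Euclidean distance to the nearest centroid; the function name `sum_of_squared_errors` and the toy data are hypothetical and are not taken from the paper.

```python
def sum_of_squared_errors(points, centroids):
    """Sum, over all points, of the squared Euclidean distance
    to the nearest centroid (assumed SSE definition)."""
    total = 0.0
    for p in points:
        # Squared distance from p to each centroid; keep the smallest.
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for c in centroids
        )
    return total


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(0.0, 1.0), (10.0, 1.0)]

print(sum_of_squared_errors(points, centroids))  # 4.0
```

Under this definition a lower SSE means points lie closer to their assigned centroids, so it is a natural objective for a metaheuristic that searches over candidate centroid sets.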
