Abstract

Clustering is one of the most significant applications in the big data field. However, applying clustering techniques to big data requires substantial processing power and resources because of the complexity involved and the resulting increase in clustering time. Therefore, many techniques have been implemented to improve the performance of clustering algorithms, especially k-means clustering. In this paper, a neural-processor-based k-means clustering technique is proposed that clusters big data by taking advantage of the dedicated machine learning processors of mobile devices. The solution was designed to run on the single-instruction machine processor that exists in the mobile device’s processor. Running k-means clustering in a distributed scheme based on mobile machine learning hardware can efficiently handle big data clustering over the network. The results showed that using the neural engine processor of a smartphone can maximize the speed of the clustering algorithm, with clustering performance up to two times faster than on traditional laptop/desktop processors. Furthermore, the number of iterations required to obtain k clusters improved by up to a factor of two compared with parallel and distributed k-means.

Highlights

  • Thousands of clustering algorithms have been published based on this concept, and k-means is one of the most used. It is applied across a wide range of applications because it is simple to implement and effective

  • This paper proposes an efficient, high-performance solution to improve k-means clustering by: 1. Maximizing the performance of the k-means algorithm by running it on the dedicated neural engine processor of smart mobile devices, adapting the code and steps of the k-means algorithm to run on the single-instruction-based machine with an ARM-based processor; 2

  • In Table 5, the number of iterations is fixed for the parallel neural k-means algorithm on the education sector dataset, i.e., for k = 4, 5, 6, 7, whereas for the parallel k-means clustering algorithm it kept changing from one run to another across multiple runs
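
To make the iteration counts discussed in the highlights concrete, here is a minimal sketch of the standard k-means loop (Lloyd's algorithm) in plain Python. It is not the paper's neural-engine implementation: it runs on an ordinary CPU, and the names `kmeans` and `squared_distance`, the `seed` parameter, and the convention of counting the final convergence check as an iteration are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: List[Point], k: int, max_iter: int = 100, seed: int = 0
) -> Tuple[List[List[float]], List[int], int]:
    """Plain k-means sketch (hypothetical helper, not the paper's code).

    Returns (centroids, labels, iterations). The iteration count includes
    the final pass in which no label changes.
    """
    rng = random.Random(seed)  # seeded so repeated runs pick the same start
    centroids = [list(p) for p in rng.sample(list(points), k)]
    labels = [-1] * len(points)

    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            return centroids, labels, iteration  # converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    return centroids, labels, max_iter


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centroids, labels, iterations = kmeans(data, k=2)
    print(labels, iterations)
```

In this sketch the iteration count depends on the initial centroids, so changing `seed` can change how many passes are needed; fixing the seed makes the count repeatable for a given dataset and k.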

Summary

Introduction

We are in an era of data flood, as shown by the massive amounts of data generated continuously at unprecedented and ever-increasing scales. Machine learning techniques have become increasingly popular in a wide range of large and complex data-intensive applications, such as astronomy, medicine, biology, and other sciences [1]. These techniques offer potential options for extracting hidden information from the data.

Big Data Clustering
Multi-Machine Clustering
Big Data Platform
Related Work
Proposed Solution
Proposed Solution Processing
Complexity
Analysis of the Experiment Results
Neural Engine Performance
Number of Iterations
Results
Multiple Cores and Multiple Processors
Conclusions