Abstract

With the generation and analysis of Big Data following the spread of diverse information devices, older data processing and management techniques reveal both hardware and software limitations. The hardware limitations can be overcome by advances in CPUs and GPUs, but addressing the software limitations has so far depended on that same hardware advancement. This study therefore sets out to reduce the rising analysis costs of dense Big Data from a software perspective rather than relying on hardware. A modified k-means algorithm with ideal points was proposed to address the analysis cost issue of dense Big Data. The proposed algorithm finds an optimal cluster by applying Principal Component Analysis (PCA) to the multi-dimensional structure of dense Big Data and categorizes the data using the predicted ideal points as the central points of the initial clusters. Its clustering validity index and F-measure results were compared with those of existing algorithms to verify its effectiveness, and the results were comparable. It was also compared and assessed against data classification techniques investigated in previous studies, where it improved analysis costs by about 3–6%.
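The abstract describes the approach only at a high level, so the sketch below is one plausible reading, not the paper's actual method: it assumes the "ideal points" are initial cluster centres chosen in PCA-reduced space (here, hypothetically, points taken at evenly spaced quantiles along the first principal component), and it uses scikit-learn's PCA and KMeans as stand-ins for the paper's implementation.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def pca_seeded_kmeans(X, n_clusters, n_components=2):
    # Project the dense, high-dimensional data onto its principal components.
    pca = PCA(n_components=n_components)
    X_reduced = pca.fit_transform(X)

    # Hypothetical "ideal points": k points taken at evenly spaced
    # quantiles along the first principal component. The abstract does
    # not specify how the paper actually predicts its ideal points.
    order = np.argsort(X_reduced[:, 0])
    positions = np.linspace(0, 1, n_clusters + 2)[1:-1]
    seeds = X_reduced[order[(positions * (len(X) - 1)).astype(int)]]

    # Run k-means with the ideal points as the initial cluster centres.
    km = KMeans(n_clusters=n_clusters, init=seeds, n_init=1, random_state=0)
    labels = km.fit_predict(X_reduced)
    return labels, km

# Usage on synthetic data
X, _ = make_blobs(n_samples=1000, n_features=10, centers=4, random_state=0)
labels, model = pca_seeded_kmeans(X, n_clusters=4)
print(model.inertia_)
```

In this sketch, fixing the initial centres lets k-means run with a single initialization (n_init=1) in a reduced-dimensional space, which is the kind of place an analysis cost saving over randomly restarted, full-dimensional k-means would come from.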
