Abstract

With the generation and analysis of Big Data following the spread of diverse information devices, older data processing and management techniques reveal both hardware and software limitations. The hardware limitations can be overcome by advances in CPUs and GPUs, but addressing the software limitations has so far depended on that same hardware advancement. This study therefore sets out to reduce the rising analysis costs of dense Big Data from a software perspective rather than relying on hardware. A modified k-means algorithm with ideal points was proposed to address the analysis cost issue of dense Big Data. The proposed algorithm finds an optimal cluster by applying Principal Component Analysis (PCA) to the multi-dimensional structure of dense Big Data and categorizes the data using the predicted ideal points as the central points of the initial clusters. Its clustering validity index and F-measure results were compared with those of existing algorithms to verify its effectiveness, and the results were comparable. It was also compared and assessed against data classification techniques investigated in previous studies, where it improved analysis costs by about 3–6%.
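The abstract describes the approach only at a high level, so the sketch below is one plausible reading, not the paper's actual method: it assumes the "ideal points" are initial cluster centres chosen in PCA-reduced space (here, hypothetically, points taken at evenly spaced quantiles along the first principal component), and it uses scikit-learn's PCA and KMeans as stand-ins for the paper's implementation.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def pca_seeded_kmeans(X, n_clusters, n_components=2):
    # Project the dense, high-dimensional data onto its principal components.
    pca = PCA(n_components=n_components)
    X_reduced = pca.fit_transform(X)

    # Hypothetical "ideal points": k points taken at evenly spaced
    # quantiles along the first principal component. The abstract does
    # not specify how the paper actually predicts its ideal points.
    order = np.argsort(X_reduced[:, 0])
    positions = np.linspace(0, 1, n_clusters + 2)[1:-1]
    seeds = X_reduced[order[(positions * (len(X) - 1)).astype(int)]]

    # Run k-means with the ideal points as the initial cluster centres.
    km = KMeans(n_clusters=n_clusters, init=seeds, n_init=1, random_state=0)
    labels = km.fit_predict(X_reduced)
    return labels, km

# Usage on synthetic data
X, _ = make_blobs(n_samples=1000, n_features=10, centers=4, random_state=0)
labels, model = pca_seeded_kmeans(X, n_clusters=4)
print(model.inertia_)
```

In this sketch, fixing the initial centres lets k-means run with a single initialization (n_init=1) in a reduced-dimensional space, which is the kind of place an analysis cost saving over randomly restarted, full-dimensional k-means would come from.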
