Research on Data filling Algorithm Based on Improved k-means and Information Entropy

Xiaofei Gong, Yijie Shi, Jie Zhang

doi:10.1109/compcomm.2018.8781052

Abstract

Human error, equipment failure, and other factors can leave gaps in the data collected by an industrial Internet platform. To fill in the missing values, this paper adopts a data filling method based on improved k-means and information entropy. First, we pre-fill the missing values with the mean or mode. Then, we cluster the data with k-means, replacing the Euclidean distance with the Mahalanobis distance, and within each cluster we calculate the similarity between each incomplete record and every complete record. Finally, following the KNN idea, we find the k complete records most similar to each incomplete record, use information entropy to calculate weight coefficients for those k records, and fill in each missing attribute with the weighted combination of the corresponding attributes of the k records. Experimental results show that the proposed algorithm achieves better filling precision than the k-means and KNN algorithms.
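
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it handles numeric attributes only (mean pre-fill, no mode), uses a single global covariance matrix for the Mahalanobis distance, and, because the abstract does not give the entropy formula, substitutes a self-information weight (-ln of each neighbor's share of the total distance) as a stand-in. The function names `mahalanobis_sq`, `kmeans_mahalanobis`, `entropy_weights`, and `fill_missing` are hypothetical.

```python
import numpy as np

def mahalanobis_sq(X, centers, VI):
    """Squared Mahalanobis distance from each row of X to each center, shape (n, c)."""
    diff = X[:, None, :] - centers[None, :, :]
    return np.einsum("ncd,de,nce->nc", diff, VI, diff)

def kmeans_mahalanobis(X, n_clusters, n_iter=50, seed=0):
    """k-means that assigns points by Mahalanobis instead of Euclidean distance.
    Assumption: one global inverse covariance matrix, estimated from X."""
    rng = np.random.default_rng(seed)
    VI = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        labels = mahalanobis_sq(X, centers, VI).argmin(axis=1)
        # Recompute each center; an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, VI

def entropy_weights(d, eps=1e-12):
    """Stand-in weighting (an assumption, not the paper's formula):
    p_i is neighbor i's share of the total distance, and the weight is
    proportional to the self-information -ln(p_i), so closer neighbors weigh more."""
    p = (d + eps) / (d + eps).sum()
    info = -np.log(p)
    total = info.sum()
    if total <= eps:  # e.g. a single neighbor: fall back to uniform weights
        return np.full(len(d), 1.0 / len(d))
    return info / total

def fill_missing(data, n_clusters=2, k=3):
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    # Step 1: pre-fill missing entries with the column mean.
    prefilled = np.where(missing, np.nanmean(data, axis=0), data)
    # Step 2: cluster the pre-filled data with Mahalanobis k-means.
    labels, VI = kmeans_mahalanobis(prefilled, n_clusters)
    complete = ~missing.any(axis=1)
    filled = prefilled.copy()
    for i in np.flatnonzero(~complete):
        # Candidates: complete records in the same cluster
        # (falling back to all complete records if the cluster has none).
        cand = np.flatnonzero(complete & (labels == labels[i]))
        if cand.size == 0:
            cand = np.flatnonzero(complete)
        if cand.size == 0:
            continue  # no complete records at all: keep the pre-filled values
        d2 = mahalanobis_sq(prefilled[cand], prefilled[i:i + 1], VI)[:, 0]
        d = np.sqrt(np.maximum(d2, 0.0))
        nearest = np.argsort(d)[:k]
        # Step 3: weight the k nearest complete records and fill the gaps.
        w = entropy_weights(d[nearest])
        filled[i, missing[i]] = w @ data[cand[nearest]][:, missing[i]]
    return filled
```

Calling `fill_missing(data, n_clusters=2, k=3)` on a 2-D array with `np.nan` marking the gaps returns a copy in which observed entries are untouched and each missing entry is a weighted average of the same attribute in up to k complete records. Because the weights are non-negative and sum to one, a filled value always lies within the range of the values it is averaged from.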

Similar Papers

What is missing from my missing data plan?
Sharon D Yeatts ... Renée H Martin
Stroke | VOL. 46
07 May 2015

How an industrial internet platform empowers the digital transformation of SMEs: theoretical mechanism and business model
Honglei Li ... Ziyu Yang
Journal of Knowledge Management | VOL. 27
14 Dec 2022

Missing Data: Its Emergence in the Real-world-A Practical Review on Google Play Apps dataset using Python
Vikalp Kumar Tripathi ... Pratyush Parashar
20 May 2022

A comprehensive industrial practice for Industrial Internet Platform (IIP): General model, reference architecture, and industrial verification
Xianyu Zhang ... Xinguo Ming
Computers & Industrial Engineering | VOL. 158
28 May 2021
