A Hybrid Method for Incomplete Data Imputation

Liang Zhao, Zhikui Chen, Yueming Hu, Zhennan Yang

doi:10.1109/hpcc-css-icess.2015.103

Abstract

With the explosive growth of data volume, research on data quality and data usability has drawn extensive attention. In this work, we focus on one aspect of data usability, incomplete data imputation, and present a novel missing-value imputation method using a stacked auto-encoder and incremental clustering (SAICI). Specifically, SAICI rests on four pillars: (i) a distinctive value assigned to fill in the missing values initially, (ii) a stacked auto-encoder (SAE) applied to locate principal features, (iii) a new incremental clustering method used to partition the incomplete data set, and (iv) the weighted values of the top nearest neighbors used to refill the missing values. Most importantly, stages (ii) through (iv) iterate until a convergence condition is satisfied. Experimental results demonstrate that the proposed scheme not only imputes missing values effectively but also achieves better time performance. Moreover, the method is suited to distributed data processing frameworks, so it can be applied to the imputation of incomplete big data.
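
The four stages named in the abstract can be sketched roughly as below. This is only an illustrative sketch under stated assumptions, not the authors' SAICI implementation: a linear PCA projection stands in for the stacked auto-encoder, a simple single-pass "leader" clustering stands in for the paper's incremental clustering, the column mean is assumed as the initial fill value, and inverse-distance weights are assumed for the neighbor average. All function names and parameters (`pca_features`, `leader_clustering`, `impute`, `radius`, `k`, `tol`) are hypothetical.

```python
import numpy as np

def pca_features(X, n_components):
    # Stand-in for stage (ii): a linear projection via SVD, NOT a stacked auto-encoder.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def leader_clustering(F, radius):
    # Stand-in for stage (iii): one pass over the rows; a row joins the nearest
    # centroid if it lies within `radius`, otherwise it starts a new cluster.
    centroids, counts = [], []
    labels = np.empty(len(F), dtype=int)
    for i, f in enumerate(F):
        if centroids:
            d = np.linalg.norm(np.asarray(centroids) - f, axis=1)
            j = int(d.argmin())
        if not centroids or d[j] > radius:
            centroids.append(f.copy())
            counts.append(1)
            labels[i] = len(centroids) - 1
        else:
            counts[j] += 1
            centroids[j] += (f - centroids[j]) / counts[j]  # running mean update
            labels[i] = j
    return labels

def impute(X, n_components=2, radius=1.0, k=3, tol=1e-4, max_iter=20):
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Stage (i): initial fill (assumption: column mean of the observed entries).
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(max_iter):
        F = pca_features(filled, n_components)      # stage (ii)
        labels = leader_clustering(F, radius)       # stage (iii)
        new = filled.copy()
        for i in np.where(missing.any(axis=1))[0]:  # stage (iv)
            peers = np.where(labels == labels[i])[0]
            peers = peers[peers != i]
            if len(peers) == 0:
                continue  # no neighbors in this cluster; keep the current fill
            d = np.linalg.norm(F[peers] - F[i], axis=1)
            order = d.argsort()[:k]
            w = 1.0 / (d[order] + 1e-8)             # inverse-distance weights
            est = (w[:, None] * filled[peers[order]]).sum(axis=0) / w.sum()
            new[i, missing[i]] = est[missing[i]]    # only missing cells change
        change = np.abs(new - filled).max()
        filled = new
        if change < tol:                            # convergence condition
            break
    return filled
```

A call might look like the following; observed entries are left untouched and only the `NaN` cells are refilled.

```python
X = np.array([[1.0, 2.0, 3.0], [1.1, 2.1, 3.1], [0.9, 1.9, np.nan],
              [8.0, 9.0, 10.0], [8.1, 9.1, 10.1], [7.9, np.nan, 9.9]])
completed = impute(X, radius=5.0)
```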
