Missing Value Imputation Based on Data Clustering

Shichao Zhang, Yongsong Qin, Xiaofeng Zhu, Jilian Zhang, Chengqi Zhang

doi:10.1007/978-3-540-79299-4_7

Open Access

Publication Date: Jan 1, 2008

Citations: 88

Affiliation: Guangxi Normal University, Singapore Management University, University of Technology Sydney

Abstract

We propose an efficient nonparametric missing value imputation method based on clustering, called CMI (Clustering-based Missing value Imputation), for dealing with missing values in target attributes. In our approach, we impute the missing values of an instance A with plausible values generated, using a kernel-based method, from the instances that contain no missing values and are most similar to A. Specifically, we first divide the dataset (including the instances with missing values) into clusters. Next, the missing values of an instance A are filled in with plausible values generated from A's cluster. Extensive experiments show the effectiveness of the proposed method on the missing value imputation task.
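
The two steps described in the abstract (cluster all instances, then fill each missing target value from complete instances in the same cluster) can be illustrated with a minimal Python sketch. This is not the authors' CMI implementation: the abstract does not name the clustering algorithm, the kernel, or the bandwidth, so the sketch assumes k-means on fully observed non-target attributes, a Gaussian kernel, and a kernel-weighted average of the observed target values. All function and parameter names (kmeans, impute_by_cluster, bandwidth) are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two attribute vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic farthest-point initialisation.

    Assumption: the abstract only says "clusters"; k-means is a stand-in.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(nxt))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def impute_by_cluster(xs, ys, k=2, bandwidth=1.0):
    """Fill None entries of ys using complete instances in the same cluster.

    xs: fully observed non-target attributes (list of lists of floats).
    ys: target attribute, with None marking a missing value.
    Assumption: Gaussian kernel weights and a weighted mean as the
    "kernel-based method"; the paper may use a different estimator.
    """
    labels = kmeans(xs, k)  # instances with missing targets are clustered too
    filled = list(ys)
    for i, y in enumerate(ys):
        if y is not None:
            continue
        donors = [j for j, yj in enumerate(ys)
                  if yj is not None and labels[j] == labels[i]]
        if not donors:  # fall back to all complete instances
            donors = [j for j, yj in enumerate(ys) if yj is not None]
        if not donors:  # nothing observed at all: leave the gap
            continue
        weights = [math.exp(-sq_dist(xs[i], xs[j]) / (2 * bandwidth ** 2))
                   for j in donors]
        total = sum(weights)
        if total == 0.0:  # every weight underflowed: use a plain mean
            weights, total = [1.0] * len(donors), float(len(donors))
        filled[i] = sum(w * ys[j] for w, j in zip(weights, donors)) / total
    return filled


if __name__ == "__main__":
    xs = [[0.0], [0.2], [0.1], [10.0], [10.2], [10.1]]
    ys = [1.0, 3.0, None, 100.0, 102.0, None]
    print(impute_by_cluster(xs, ys, k=2, bandwidth=1.0))
```

Restricting the donors to the instance's own cluster keeps the kernel average local and reduces the number of distance computations, which is one plausible reading of why the method is described as efficient.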

Full Text