Missing Value Imputation via Clusterwise Linear Regression

Napsu Karmitsa, Pauliina Makinen, Sona Taheri, Adil Bagirov

doi:10.1109/tkde.2020.3001694

Abstract

In this paper, a new method for preprocessing incomplete data is introduced. The method is based on clusterwise linear regression and combines two well-known approaches to missing value imputation: linear regression and clustering. The idea is to approximate missing values using only those data points that are similar to the incomplete data point. Clustering-based imputation methods rely on a similar idea. Here, however, linear regression is applied within each cluster to predict the missing values, and the regression models are fitted simultaneously with the clustering. The proposed method is tested on synthetic and real-world data sets and compared with other missing value imputation algorithms. Numerical results demonstrate that the proposed method produces the most accurate imputations in MCAR (missing completely at random) and MAR (missing at random) data sets that have a clear structure and no more than 25% missing data.
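The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify how the clusterwise linear regression problem is solved, so the sketch swaps in a simple alternating heuristic (refit one regression per cluster, then reassign each point to the regression that fits it best). The function names `fit_clr` and `impute_column`, the restriction to a single incomplete column, and the rule of assigning an incomplete row to the cluster with the nearest feature centroid are all assumptions made for illustration.

```python
import numpy as np


def fit_clr(X, y, k, n_iter=50, seed=0):
    """Alternating heuristic for clusterwise linear regression (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = np.hstack([X, np.ones((n, 1))])      # append an intercept column
    labels = rng.integers(k, size=n)         # random initial cluster assignment
    coefs = np.zeros((k, A.shape[1]))
    for _ in range(n_iter):
        for c in range(k):
            mask = labels == c
            # Refit only clusters with enough points; otherwise keep old coefficients.
            if mask.sum() >= A.shape[1]:
                coefs[c] = np.linalg.lstsq(A[mask], y[mask], rcond=None)[0]
        # Reassign every point to the regression with the smallest squared error.
        errors = (A @ coefs.T - y[:, None]) ** 2
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return coefs, labels


def impute_column(data, target, k=2, seed=0):
    """Impute NaNs in one column; assumes all other columns are fully observed."""
    data = np.asarray(data, dtype=float)
    feats = [j for j in range(data.shape[1]) if j != target]
    missing = np.isnan(data[:, target])
    X_obs, y_obs = data[~missing][:, feats], data[~missing, target]
    coefs, labels = fit_clr(X_obs, y_obs, k, seed=seed)

    # Assumption: an incomplete row joins the cluster with the nearest feature centroid.
    centroids = np.full((k, len(feats)), np.inf)  # empty clusters are never chosen
    for c in range(k):
        if np.any(labels == c):
            centroids[c] = X_obs[labels == c].mean(axis=0)

    X_mis = data[missing][:, feats]
    dist2 = ((X_mis[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = dist2.argmin(axis=1)

    # Predict each missing value with the regression of its assigned cluster.
    A_mis = np.hstack([X_mis, np.ones((X_mis.shape[0], 1))])
    out = data.copy()
    out[missing, target] = np.einsum("ij,ij->i", A_mis, coefs[nearest])
    return out
```

A call such as `impute_column(data, target=1, k=2)` returns a copy of `data` with the NaN entries of column 1 filled in and all observed entries left unchanged. The alternating heuristic depends on its random initialization and can stop at a local optimum, so in practice one would typically try several seeds.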

Journal: IEEE Transactions on Knowledge and Data Engineering
Publication Date: Jan 1, 2020
Citations: 26
