Abstract

The presence of missing data in a dataset plays a vital role in the design of classification, clustering, or regression methods. An efficient missing data imputation can enhance the overall performance of a machine learning method. This paper ensembles k-nearest neighbour imputation, local least square imputation, miss forest imputation, and k-means clustering imputation using the bagging approach to handle missing values over a wide range of datasets. The method has been tested with eight different datasets in terms of root mean square error, median absolute percentage error, mean absolute percentage error, and standard deviation. Experimental results show that our method gives a low error rate compared to its closed competitors.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call