Abstract

Electronic health records (EHR) are a major source of information in biomedical informatics. Yet, missing values are prominent characteristics of EHR. Prediction on dataset with missing values results in inaccurate inferences. Nearest neighbour imputation based on lazy learning approach is a proven technique for missing data imputation and is recognized as one among the top ten data mining algorithms due to its simplicity and understandability. But its performance is deteriorated due to the curse of dimensionality as unimportant features are likely to dominate. We address this problem by proposing a novel approach for feature weighting based on a hybrid of metaheuristic whale optimization algorithm (WOA) and local search late acceptance hill climbing algorithm (LAHCA) on nearest neighbour imputation method. Our proposed approach Metaheuristic and Local Search based Feature Weighted Nearest Neighbour Imputation (kNN+LAHCAWOA) also learns different k values for different test points. Our approach is tested on benchmark EHR datasets with three proven classifiers Support Vector Machines(SVM), Random forest(RF) and Deep neural networks(DNN). The results prove that kNN+LAHCAWOA is an effective imputation strategy and aids in improving the classification performance when compared with its competitor methods.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.