Abstract
Missing data arise when the values of some variables, or entire observations, are absent from a dataset. Researchers typically handle them either by excluding the affected variables and records or by imputing the missing values. This study proposes Fuzzy K-Top Matching Value (FKTM) for missing value imputation. FKTM imputes missing numerical and categorical data with estimates derived from similar records, which decreases bias. The method uses expectation-maximization together with fuzzy clustering to find a group of similar records and estimate the missing values from them. We compare FKTM-imputed data with the original Immunotherapy and Cryotherapy datasets, and apply multiple classification techniques to the imputed datasets. Random Forest achieved the best result, with 93.3% on Cryotherapy and 85.6% on Immunotherapy. Using a Support Vector Machine classifier, the proposed approach is also compared with Multivariate Imputation by Chained Equations (MICE) and outperforms it, reaching 82.2% accuracy. On the Cryotherapy dataset, the proposed approach surpasses existing strategies with 86.6% accuracy. The Levene and Shapiro-Wilk tests were used to examine the homoscedasticity and normality of the data after imputation; the proposed imputation procedure has no detrimental influence on the dataset. Finally, the execution time and the root-mean-square error (RMSE) of the imputed values are determined for three datasets with varied sample sizes and data dimensions. The proposed system exhibits a fast execution time and a low RMSE. Overall, FKTM performs well in the experiments and appears promising.
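To make the idea of imputing from similar records concrete, the following is a minimal sketch and not the authors' FKTM algorithm: it omits the fuzzy clustering and expectation-maximization steps and simply fills each missing value from the k most similar records. The similarity measure, the default k, and all function names (`similarity`, `impute_top_k`, `rmse`) are illustrative assumptions that do not come from the paper.

```python
import math
from collections import defaultdict

def column_ranges(rows):
    """Range (max - min) of each numeric column; 0.0 for categorical columns."""
    ranges = []
    for j in range(len(rows[0])):
        vals = [r[j] for r in rows if r[j] is not None and not isinstance(r[j], str)]
        ranges.append(max(vals) - min(vals) if vals else 0.0)
    return ranges

def similarity(a, b, ranges):
    """Mean per-attribute similarity in [0, 1] over attributes observed in both records.
    Assumed measure: exact match for categories, range-scaled distance for numbers."""
    scores = []
    for j, (x, y) in enumerate(zip(a, b)):
        if x is None or y is None:
            continue
        if isinstance(x, str) or isinstance(y, str):
            scores.append(1.0 if x == y else 0.0)
        else:
            span = ranges[j] or 1.0
            scores.append(1.0 - abs(x - y) / span)
    return sum(scores) / len(scores) if scores else 0.0

def impute_top_k(rows, k=3):
    """Fill each None from the k most similar records that observe that attribute.
    Numeric: similarity-weighted mean. Categorical: similarity-weighted vote."""
    ranges = column_ranges(rows)
    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = [(similarity(row, other, ranges), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0], reverse=True)
            top = donors[:k]
            if not top:
                continue  # no donor observes this attribute; leave it missing
            if isinstance(top[0][1], str):
                votes = defaultdict(float)
                for weight, category in top:
                    votes[category] += weight
                result[i][j] = max(votes, key=votes.get)
            else:
                total = sum(weight for weight, _ in top)
                if total > 0:
                    result[i][j] = sum(w * v for w, v in top) / total
                else:
                    result[i][j] = sum(v for _, v in top) / len(top)
    return result

def rmse(true_values, imputed_values):
    """Root-mean-square error between held-out true values and their imputations."""
    squared = [(t - p) ** 2 for t, p in zip(true_values, imputed_values)]
    return math.sqrt(sum(squared) / len(squared))

# Toy records (made up for illustration): [age, numeric feature, category]
records = [
    [30, 5.0, "male"],
    [32, 6.0, "male"],
    [31, None, "male"],
    [60, 20.0, "female"],
    [62, 22.0, None],
]
filled = impute_top_k(records, k=2)
```

In this toy example the missing numeric value is estimated from the two records closest in age and category, and the missing category is taken from the nearest record by weighted vote. A fuzzy-clustering variant would replace the hard top-k selection with cluster membership degrees.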
More From: Journal of King Saud University - Computer and Information Sciences