Missing values and incomplete data are frequently encountered in medical records, and they become a serious problem when an analysis requires complete data. This research aimed to evaluate the performance of the Fuzzy Subtractive Clustering (FSC) and Fuzzy C-Means (FCM) methods for imputing missing values. Both methods were implemented on medical data. Earlier work on imputation had used K-Means, a crisp clustering approach; this research applied fuzzy clustering instead. Its primary contribution is the proposed fuzzy logic imputation method, which takes uncertainty into account. The data sample consisted of patients who were at least 40 years old and had a history of hypertension, diabetes, heart disease, stroke, or chronic kidney disease. The test was carried out by removing randomly selected portions of the data from the complete medical records, with removal probabilities ranging from 10% to 50%. The ANOVA test gave a p-value greater than α (= 0.05), which indicates that the imputed values do not differ significantly from the original values under either FSC or FCM. The algorithms' performance was evaluated using the Pearson correlation coefficient. According to the t-test results, the FCM method has a higher correlation coefficient than the FSC method, which suggests that FCM is superior to FSC for this imputation task.
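The evaluation protocol described above (hide values at random, impute them, then compare imputed and original values with the Pearson correlation coefficient) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the synthetic data, the function names `make_demo_data`, `mask_at_random`, and `fcm_impute`, the number of clusters, the fuzzifier `m = 2`, and the choice to fill a missing entry with a membership-weighted average of the cluster centers. Only the FCM variant is sketched; FSC and the ANOVA and t-test steps are omitted.

```python
import numpy as np

def make_demo_data(rng, n_per_cluster=100, n_features=4):
    """Synthetic stand-in for the medical records: three Gaussian blobs."""
    blobs = [rng.normal(loc=c, scale=1.0, size=(n_per_cluster, n_features))
             for c in (0.0, 10.0, 20.0)]
    return np.vstack(blobs)

def mask_at_random(X, rate, rng):
    """Hide each entry independently with probability `rate`."""
    mask = rng.random(X.shape) < rate
    X_missing = X.copy()
    X_missing[mask] = np.nan
    return X_missing, mask

def fcm_impute(X_missing, n_clusters=3, m=2.0, n_iter=50, seed=0):
    """Impute NaN entries by alternating fuzzy c-means updates and re-filling."""
    rng = np.random.default_rng(seed)
    miss = np.isnan(X_missing)
    # Start from column means so that distances can be computed.
    X = np.where(miss, np.nanmean(X_missing, axis=0), X_missing)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers: membership-weighted means of the current data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance of every row to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Standard FCM membership update.
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Re-fill missing entries with a membership-weighted center average.
        W = U ** m
        estimate = (W @ centers) / W.sum(axis=1, keepdims=True)
        X = np.where(miss, estimate, X_missing)
    return X

rng = np.random.default_rng(42)
X_true = make_demo_data(rng)
for rate in (0.1, 0.3, 0.5):
    X_missing, mask = mask_at_random(X_true, rate, rng)
    X_imputed = fcm_impute(X_missing)
    # Pearson correlation between original and imputed values at hidden cells.
    r = np.corrcoef(X_true[mask], X_imputed[mask])[0, 1]
    print(f"missing rate {rate:.0%}: Pearson r = {r:.3f}")
```

Because memberships are graded rather than all-or-nothing, a record that lies between two clusters receives an imputed value that blends both centers, which is the sense in which a fuzzy approach accounts for uncertainty where a crisp K-Means assignment would not.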