Machine learning is advancing medical data science, including the prediction of glucose levels. A supervised machine learning technique is employed in which regression and classification methods are used to evaluate prediction performance, and an unsupervised machine learning technique forms clusters based on similarities among the variables. Furthermore, a transfer learning technique is proposed to improve the prediction accuracy of conventional machine learning techniques. Based on a median value of 67 mg/dL, the data set is divided into two groups: group 1 (blood sugar level, BSL, of 57 mg/dL to 67 mg/dL) contains 50.67% of the samples, and group 2 (BSL of 68 mg/dL to 79 mg/dL) contains 49.33% of the samples. In the regression analysis, 5-fold cross-validation is performed; the decision tree (DT) and gradient boosting (GB) each provide a prediction accuracy of 18.2%. In the classification analysis, 10-fold cross-validation is used for training and testing the models; AdaBoost, GB, random forest, and neural network achieve an accuracy of 66.3% and an area under the curve (AUC) score of 0.731. In the unsupervised analysis, the data set is divided into three clusters, and the clustering result is then used in the regression and classification models through transfer learning. The accuracy, precision, and F1 score are 69.6%, 0.696, and 0.661 for AdaBoost and 69.6%, 0.708, and 0.708 for GB, respectively.
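The median-based grouping and the k-fold splitting described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `median_split` and `k_fold_indices` and the sample readings are hypothetical, and the folds are contiguous and unshuffled for simplicity, whereas the abstract does not state how the folds were drawn.

```python
from statistics import median


def median_split(bsl_values, threshold=None):
    """Label each reading 0 (at or below the threshold) or 1 (above it).

    If no threshold is given, the median of the readings is used.
    """
    if threshold is None:
        threshold = median(bsl_values)
    return [0 if value <= threshold else 1 for value in bsl_values]


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical BSL readings in mg/dL, invented for illustration only.
readings = [57, 60, 63, 65, 67, 68, 70, 73, 76, 79]

# Group 1 (label 0): 57 to 67 mg/dL; group 2 (label 1): 68 to 79 mg/dL.
labels = median_split(readings, threshold=67)

# 5 folds as in the regression analysis; pass k=10 for the classification setup.
folds = list(k_fold_indices(len(readings), k=5))
```

Each pair in `folds` holds the training and test indices for one round; a model would be fitted on the training indices and scored on the test indices, and the scores averaged over all rounds.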