This study compares Mel Frequency Cepstral Coefficients (MFCC), Principal Component Analysis (PCA) and Independent Component Analysis (ICA) as feature extraction methods, each paired with ten regression algorithms (AdaBoost, Bayesian Ridge, Decision Tree, Elastic Net, k-NN, Linear Regression, MLP, Random Forest, Ridge Regression and Support Vector Regression), for quantifying blood glucose concentration. A total of 122 participants, both healthy and diagnosed with type 2 diabetes, were invited to take part. The participants were divided into two partitions: a training subset of 72 participants, used for model selection, and a validation subset of the remaining 50 participants, used to test the selected model. Optical measurements were acquired with a low-cost microcontroller unit inside a 3D-printed chamber that provided a light-controlled environment. The MFCC, PCA and ICA features were calculated on an open-hardware computing platform. The glucose levels estimated by the system were compared with the actual glucose concentrations measured by venipuncture in a laboratory test, using the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the Clarke error grid. The best results were obtained for MFCC with AdaBoost and Random Forest (MAE = 11.6 for both).
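As a reminder of what the two scalar error metrics named above measure, the following minimal Python sketch computes the mean absolute error and the mean absolute percentage error from paired reference and estimated values. The function names and the three sample readings are hypothetical and chosen only for illustration; they are not data from the study, and the sketch is not the authors' implementation.

```python
from typing import Sequence


def mean_absolute_error(reference: Sequence[float], estimated: Sequence[float]) -> float:
    """Average absolute difference, in the same units as the inputs."""
    if len(reference) != len(estimated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - e) for r, e in zip(reference, estimated)) / len(reference)


def mean_absolute_percentage_error(reference: Sequence[float], estimated: Sequence[float]) -> float:
    """Average absolute difference relative to the reference value, in percent."""
    if len(reference) != len(estimated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    relative = [abs(r - e) / abs(r) for r, e in zip(reference, estimated)]
    return 100.0 * sum(relative) / len(reference)


# Hypothetical readings, made up for illustration only (not study data).
reference_glucose = [100.0, 150.0, 200.0]
estimated_glucose = [110.0, 135.0, 220.0]

mae = mean_absolute_error(reference_glucose, estimated_glucose)
mape = mean_absolute_percentage_error(reference_glucose, estimated_glucose)
print(f"MAE = {mae:.1f}, MAPE = {mape:.1f}%")
```

The MAE is expressed in the units of the measurements themselves, whereas the MAPE is scale-free, which is one reason the two are often reported together. The Clarke error grid is a zone-based clinical assessment and is not covered by this sketch.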