Abstract

Advances in biotechnology and the health sciences have produced a rapid growth in data, ranging from high-throughput genetic information to clinical data drawn from large electronic health record repositories. Machine learning and data mining methods have therefore become essential in the biosciences for converting this information into usable knowledge. Diabetes mellitus is a metabolic condition that affects human health, and extensive research on its diagnosis and therapy has generated vast volumes of data. Manual diagnosis of diabetes is time-consuming, so automatic detection and diagnosis using artificial intelligence and machine learning is gaining prominence. In this work, we devise a novel machine learning architecture for the automatic diagnosis of diabetes. We implemented the model with several algorithms and found random forest to be the best-performing and most accurate classifier. With mRMR feature selection and a 0.2 test split, it achieved an accuracy of 98.07%, precision, recall, and F1-score of 98%, and a logarithmic loss of 0.03. With mRMR feature selection and a 0.3 test split, it achieved an accuracy of 96.79%, precision, recall, and F1-score of 97%, and a log loss of 0.11. The proposed model also outperformed most state-of-the-art models.
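
For readers unfamiliar with the reported metrics, the following minimal sketch shows how accuracy, precision, recall, F1-score, and logarithmic loss are conventionally computed from predicted probabilities. It assumes a binary diabetic/non-diabetic labelling and a 0.5 decision threshold, neither of which is stated in the abstract. The function name `binary_metrics` and the toy labels and probabilities are hypothetical and are not taken from the study's data or code.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5, eps=1e-15):
    """Compute standard binary classification metrics.

    y_true: list of 0/1 ground-truth labels (assumed: 1 = diabetic).
    y_prob: list of predicted probabilities for class 1.
    """
    # Turn probabilities into hard predictions at the chosen threshold.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Log loss penalises confident wrong probabilities; clip to avoid log(0).
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "log_loss": log_loss,
    }


# Toy, made-up data purely for illustration (not from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.1, 0.8, 0.4, 0.2, 0.6, 0.7, 0.3]
metrics = binary_metrics(y_true, y_prob)
print(metrics)
```

Unlike the other four metrics, log loss depends on the predicted probabilities rather than on the thresholded labels, so a model can reach high accuracy and still be penalized for poorly calibrated probabilities.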
