Background
Machine learning has been evaluated for the early detection of diseases that have a clear-cut diagnostic gold standard. However, little is known about its usefulness for clinical decisions that have no absolute standard, such as insulin initiation by diabetes specialists.

Objective
The objectives of this study were to examine the ability of machine learning models to predict insulin initiation by specialists and to determine whether the machine learning approach could support decision making by general physicians for insulin initiation in patients with type 2 diabetes.

Methods
Data from patients prescribed hypoglycemic agents from December 2009 to March 2015 were extracted from diabetes specialists’ registries, resulting in a sample of 4860 patients who had received initial monotherapy with either insulin (n=293) or noninsulin (n=4567) agents. The neural network output was a score for insulin initiation ranging from 0 to 1, with a cutoff of >0.5 for the dichotomous classification. Accuracy, recall, and area under the receiver operating characteristic curve (AUC) were calculated to compare the ability of machine learning models to make decisions regarding insulin initiation with that of logistic regression and general physicians. To compare the decision-making ability of machine learning and logistic regression with that of general physicians, 7 cases were chosen from the patient information as the gold standard, based on the agreement of 8 of the 9 specialists.

Results
The AUCs, accuracy, and recall of logistic regression were higher than those of machine learning (AUCs of 0.89-0.90 for logistic regression versus 0.67-0.74 for machine learning). When the examination was limited to cases receiving insulin, discrimination by machine learning was similar to that of logistic regression (recall of 0.05-0.68 for logistic regression versus 0.11-0.52 for machine learning). For 43 randomly selected cases, the accuracies of logistic regression, a machine learning model (downsampling ratio of 1:8), and general physicians were 0.80, 0.70, and 0.66, respectively. For the 7 gold standard cases, the accuracies of logistic regression and the machine learning model were 1.00 and 0.86, respectively, with a downsampling ratio of 1:8; both were higher than the accuracy of general physicians (0.43).

Conclusions
Although machine learning did not outperform logistic regression, it predicted insulin initiation, as defined by diabetes specialists’ choice of the gold standard, more accurately than general physicians. Further study is needed before machine learning–based decision support systems for insulin initiation can be incorporated into clinical practice.
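
As an illustration of the evaluation metrics named in Methods, the following minimal Python sketch shows how a score between 0 and 1 might be dichotomized at the >0.5 cutoff and how accuracy, recall, and AUC could then be computed. It is not the study's code: the function names, the rank-based AUC formulation, and the eight example scores are all assumptions made for illustration, and the data are hypothetical rather than taken from the registries.

```python
# Hypothetical sketch: names and data below are illustrative assumptions,
# not the study's implementation or results.

def classify(scores, cutoff=0.5):
    """Dichotomize model outputs: 1 (insulin) if score > cutoff, else 0."""
    return [1 if s > cutoff else 0 for s in scores]

def accuracy(y_true, y_pred):
    """Fraction of cases where the prediction matches the label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def recall(y_true, y_pred):
    """Fraction of true insulin cases (label 1) that were predicted as 1."""
    positives = sum(1 for t in y_true if t == 1)
    if positives == 0:
        raise ValueError("recall is undefined without positive cases")
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return true_pos / positives

def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = specialist initiated insulin) and model scores.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.5, 0.7]
y_pred = classify(scores)

print("accuracy:", accuracy(y_true, y_pred))
print("recall:", recall(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Because accuracy and recall depend on the cutoff while AUC depends only on how the scores rank the cases, a model can show a reasonable AUC and still have low recall for the rarer insulin class at a fixed cutoff.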