Abstract

Introduction: The endoscopic severity of ulcerative colitis (UC) predicts clinical outcomes and is essential for guiding treatment and evaluating therapeutic response. The Mayo endoscopic score (MES) is commonly used to classify mucosal damage objectively; it ranges from 0 to 3, with higher scores reflecting greater severity. With advances in machine learning, artificial intelligence is increasingly used to automate image analysis. The convolutional neural network (CNN) is a powerful deep learning method for image recognition. In this study, we aimed to assess the diagnostic accuracy parameters of CNN-based machine learning algorithms for predicting UC severity.

Methods: Multiple databases, including Medline, Scopus, and Embase, were searched from inception to May 2022 using specific terms for studies evaluating the diagnostic accuracy parameters of machine learning algorithms in assessing UC severity. Inclusion was restricted to studies that employed CNN-based algorithms. Outcomes of interest were the pooled accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Multiple 4×4 contingency tables assessing the diagnostic accuracy of the algorithms were treated as independent of each other, because the goal was to study the overall direction of the pooled rates rather than to calculate precise point estimates. Standard meta-analysis methods were employed using the random-effects model, and heterogeneity was assessed using the I² statistic.

Results: Twelve studies that exclusively used CNN algorithms were included; studies that used support vector machines or a combination were excluded. In the majority of studies, the CNN algorithm was trained and tested to predict each Mayo score (0, 1, 2, and 3) individually. In a few studies, it was used to differentiate Mayo 0 from Mayo 1, or Mayo 0-1 from Mayo 2-3. Although the 'ground truth' differed across studies, the individual 4×4 tables were treated as independent of each other for the purpose of this study.
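The abstract does not state how each 4×4 table was reduced to sensitivity, specificity, PPV, and NPV. One common approach, assumed here purely for illustration, is to collapse the table one Mayo grade at a time into a 2×2 table (that grade versus all others). The Python sketch below does this; the function name `one_vs_rest_metrics` and the counts are hypothetical and are not taken from any included study.

```python
import numpy as np

def one_vs_rest_metrics(confusion, grade):
    """Collapse a K x K confusion table to 2 x 2 for one grade.

    Assumed layout: rows are the reference (endoscopist) grade,
    columns are the grade predicted by the algorithm.
    """
    confusion = np.asarray(confusion, dtype=float)
    tp = confusion[grade, grade]             # grade correctly predicted
    fn = confusion[grade, :].sum() - tp      # grade present but missed
    fp = confusion[:, grade].sum() - tp      # grade predicted but absent
    tn = confusion.sum() - tp - fn - fp      # everything else
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / confusion.sum(),
    }

# Hypothetical counts for Mayo 0-3, invented for illustration only.
example = [
    [50, 5, 0, 0],
    [4, 40, 6, 0],
    [0, 5, 30, 5],
    [0, 0, 5, 50],
]
m = one_vs_rest_metrics(example, grade=0)
```

Under this assumption, a single 4×4 table yields up to four 2×2 tables, which would be consistent with the number of pooled datasets exceeding the number of studies.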
The pooled rates were as follows: accuracy 91.2% (95% CI 87.4-93.9, I² = 84%), sensitivity 83.9% (79.2-87.7, 89%), specificity 92.3% (89.5-94.4, 84%), PPV 86.5% (80.7-90.8, 89%), and NPV 89.4% (85.8-92.2, 78%) (Figure 1).

Conclusion: Based on our meta-analysis of 12 studies, CNN-based machine learning algorithms demonstrated excellent pooled diagnostic accuracy parameters. Further work appears to be needed to raise the NPV above 90%. Future well-controlled studies are warranted to establish the clinical use of these algorithms in comparison with endoscopists' assessment of colonoscopic UC severity (Table 1).

Figure 1. Forest plot of sensitivities of the selected studies.

Table 1. Pooled rates of outcomes of interest

| Outcome | Pooled rate (95% confidence interval) | I² heterogeneity | Datasets |
| --- | --- | --- | --- |
| Accuracy | 91.2% (87.4-93.9) | 84% | 20 |
| Sensitivity | 83.9% (79.2-87.7) | 89% | 30 |
| Specificity | 92.3% (89.5-94.4) | 84% | 30 |
| PPV | 86.5% (80.7-90.8) | 89% | 18 |
| NPV | 89.4% (85.8-92.2) | 78% | 18 |
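The pooled rates above come from a random-effects model, with I² as the heterogeneity measure. The abstract does not name the estimator or the transformation used, so the following Python sketch makes its own assumptions: proportions are pooled on the logit scale with a 0.5 continuity correction, and the between-study variance is estimated with the DerSimonian-Laird method. The function name `pool_proportions_dl` and the example counts are hypothetical.

```python
import math

def pool_proportions_dl(events, totals, z=1.96):
    """Pool proportions with a DerSimonian-Laird random-effects model.

    Returns (pooled_rate, ci_low, ci_high, i_squared_percent).
    """
    if len(events) < 2:
        raise ValueError("need at least two datasets to pool")
    logits, variances = [], []
    for e, n in zip(events, totals):
        # Assumption: add 0.5 to each cell so 0% or 100% rates stay finite.
        pos, neg = e + 0.5, (n - e) + 0.5
        logits.append(math.log(pos / neg))
        variances.append(1.0 / pos + 1.0 / neg)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))
    df = len(logits) - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights and pooled logit with its standard error.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return (expit(pooled), expit(pooled - z * se),
            expit(pooled + z * se), i2)

# Hypothetical per-dataset counts (correct classifications, total images).
rate, low, high, i2 = pool_proportions_dl([95, 70, 85], [100, 100, 100])
print(f"pooled {rate:.1%} (95% CI {low:.1%}-{high:.1%}), I2 = {i2:.0f}%")
```

Because the random-effects weights include the between-study variance, high heterogeneity (such as the 78-89% reported in Table 1) pulls the weights toward equality across datasets and widens the confidence interval relative to a fixed-effect analysis.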

