The primary aim of this study was to compare the accuracy of three convolutional neural network (CNN) models in measuring the four positions of ocular duction. The secondary aim was to compare the accuracy of each CNN model on the training dataset against ophthalmologist measurements. The study included 526 subjects aged over 18 who visited the ophthalmology outpatient department. Ocular images were captured with mobile phones in various gaze positions and stored anonymously as JPEG files. Ocular duction was measured by assessing the deviation of the corneal light reflex from the central cornea. Ductions were classified as 30, 60, or 90 prism diopters (PD) or full duction from the primary position. Three CNN models, MobileNet, ResNet, and EfficientNet, were used to classify ocular duction, and their predictive ability was evaluated using the area under the receiver operating characteristic curve (AUROC). The dataset was divided into training (2,001 images), evaluation (213 images), and testing (190 images) groups, which were constructed from routine follow-up data of volunteers at the Ophthalmology Department of Phramongkutklao Hospital between February 2023 and June 2023. The specific variants used were MobileNet_V3_Large, ResNet101, and EfficientNet_B5, and their performance in classifying duction angles was assessed with receiver operating characteristic (ROC) curves. The training times for MobileNet, ResNet, and EfficientNet were 5.54, 9.56, and 26.39 minutes, respectively. In the testing phase, the AUROC values for MobileNet, ResNet, and EfficientNet, respectively, were 0.77, 0.50, and 0.58 for 30 PD; 0.71, 0.83, and 0.81 for 60 PD; 0.70, 0.73, and 0.81 for 90 PD; and 0.91, 0.93, and 0.94 for full duction.
Analysis of variance revealed no significant difference in mean AUROC among the models (p = 0.936). MobileNet had the narrowest confidence interval for average prediction accuracy among the three CNN models. The three CNN models did not differ significantly in their efficacy in detecting the various duction positions. However, MobileNet stood out for its narrower confidence interval and shorter training time, which indicates its potential for application.
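The analysis of variance can be illustrated with a minimal sketch in Python. It assumes that the test was a one-way ANOVA over the four per-position test values reported above for each model; the abstract does not state exactly which values entered the analysis, so this is an illustration rather than the authors' procedure. The function name and data layout are hypothetical, and the closed-form p-value used here holds only when exactly three groups are compared.

```python
from statistics import mean, stdev

# Test-phase values reported in the abstract, ordered 30 PD, 60 PD, 90 PD, full duction.
auroc = {
    "MobileNet": [0.77, 0.71, 0.70, 0.91],
    "ResNet": [0.50, 0.83, 0.73, 0.93],
    "EfficientNet": [0.58, 0.81, 0.81, 0.94],
}


def one_way_anova_three_groups(groups):
    """Return (F, p) for a one-way ANOVA over exactly three groups (hypothetical helper)."""
    if len(groups) != 3:
        raise ValueError("the closed-form p-value below requires exactly three groups")
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1               # 2 for three groups
    df_within = len(all_values) - len(groups)  # 9 for twelve values
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # With 2 numerator degrees of freedom, the F survival function reduces to
    # (1 + 2F/d2) ** (-d2/2), so no statistics library is needed.
    p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)
    return f_stat, p_value


f_stat, p_value = one_way_anova_three_groups(list(auroc.values()))

# Sample standard deviation per model, as a rough proxy for confidence interval width
# (with equal group sizes, a narrower spread gives a narrower interval).
spread = {name: stdev(values) for name, values in auroc.items()}

print(f"F = {f_stat:.4f}, p = {p_value:.3f}")
for name, sd in spread.items():
    print(f"{name}: mean = {mean(auroc[name]):.4f}, sd = {sd:.4f}")
```

Under these assumptions the sketch gives F of about 0.067 and p of about 0.936, in line with the reported p-value, and MobileNet shows the smallest spread across the four positions.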