Abstract

Purpose: To compare the efficacy and efficiency of training neural networks for medical image classification using comparison labels, which indicate relative disease severity, versus diagnostic class labels, on a retinopathy of prematurity (ROP) image dataset.

Design: Evaluation of diagnostic test or technology.

Participants: Deep learning neural networks trained on expert-labeled wide-angle retinal images from patients undergoing diagnostic ROP examinations, acquired as part of the Imaging and Informatics in ROP (i-ROP) cohort study.

Methods: Neural networks were trained with either class or comparison labels indicating plus disease severity in ROP retinal fundus images from 2 datasets. After training and validation, all networks were evaluated on a separate test dataset in 1 of 2 binary classification tasks: normal versus abnormal, or plus versus nonplus.

Main Outcome Measures: Area under the receiver operating characteristic curve (AUC) values were measured to assess network performance.

Results: Given the same number of labels, networks trained on comparison labels learned more efficiently, achieving significantly higher AUCs in both classification tasks across both datasets. Similarly, given the same number of images, comparison learning yielded networks with significantly higher AUCs on both classification tasks in 1 of 2 datasets. The difference in efficiency and accuracy between models trained on either label type decreased as the size of the training set increased.

Conclusions: Comparison labels are individually more informative and more abundant per sample than class labels. These findings indicate a potential means of overcoming the common obstacles of data variability and scarcity when training neural networks for medical image classification tasks.
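The abstract does not specify how the comparison-label networks were implemented. A common way to train from pairwise severity comparisons is a Siamese-style setup: a shared encoder maps each image to a scalar severity score, and a ranking loss pushes the more severe image's score above the less severe one's. The sketch below illustrates this under stated assumptions; the SeverityScorer class, the train_step helper, the ResNet-18 backbone, and the margin ranking loss are all illustrative choices, not the authors' actual architecture or loss.

import torch
import torch.nn as nn
import torchvision.models as models

class SeverityScorer(nn.Module):
    """Shared encoder mapping a fundus image to a scalar severity score."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)  # placeholder backbone (assumption)
        backbone.fc = nn.Linear(backbone.fc.in_features, 1)
        self.backbone = backbone

    def forward(self, x):
        # (batch, 3, H, W) -> (batch,) severity scores
        return self.backbone(x).squeeze(-1)

model = SeverityScorer()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# target = +1 when img_a shows more severe plus disease than img_b, -1 otherwise
criterion = nn.MarginRankingLoss(margin=0.5)

def train_step(img_a, img_b, target):
    """One gradient update from a single pairwise comparison label."""
    optimizer.zero_grad()
    score_a, score_b = model(img_a), model(img_b)
    loss = criterion(score_a, score_b, target)
    loss.backward()
    optimizer.step()
    return loss.item()

Because the encoder outputs a continuous severity score, the binary tasks in the abstract (normal versus abnormal, plus versus nonplus) can be evaluated by sweeping a threshold over that score and computing the AUC, without retraining a separate classifier for each task.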
