Objective: This study develops and evaluates multimodal machine learning models for differentiating bacterial and fungal keratitis using a prospective, representative dataset from South India.
Design: Machine learning classifier training and validation study.
Participants: 599 subjects diagnosed with acute infectious keratitis at Aravind Eye Hospital in Madurai, India.
Methods: We developed and compared three prediction models to distinguish bacterial and fungal keratitis using a prospective, consecutively collected, representative dataset gathered over a full calendar year (the MADURAI dataset). These models included a clinical data model, a computer vision model using the EfficientNet architecture, and a multimodal model combining both imaging and clinical data. We partitioned the MADURAI dataset into 70% train/validation and 30% test sets. Model training was performed with 5-fold cross-validation. We also compared the performance of the MADURAI-trained computer vision model against a model with identical architecture but trained on a pre-existing dataset collated from multiple prior bacterial and fungal keratitis randomized clinical trials (the RCT-trained computer vision model).
Main Outcome Measures: The primary evaluation metric was the area under the precision-recall curve (AUPRC). Secondary metrics included area under the receiver operating characteristic curve (AUROC), accuracy, and F1 score.
Results: The MADURAI-trained computer vision model outperformed the clinical data model and the RCT-trained computer vision model on the hold-out test set, with an AUPRC of 0.94 (95% CI: 0.92-0.96), AUROC of 0.81 (0.76-0.85), accuracy of 77%, and F1 score of 0.85. The multimodal model did not substantially improve performance compared to the computer vision model.
Conclusions: The best-performing machine learning classifier for infectious keratitis was a computer vision model trained using the MADURAI dataset. These findings suggest that image-based deep learning could significantly enhance diagnostic capabilities for infectious keratitis, and emphasize the importance of using prospective, consecutively collected, representative data for machine learning model training and evaluation.
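The evaluation protocol described in the Methods (a 70%/30% hold-out split, 5-fold cross-validation on the training portion, and AUPRC reported with a 95% confidence interval) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the plain shuffled split, the use of average precision as the AUPRC estimate, and the percentile bootstrap for the interval are all assumptions made for illustration; the abstract does not say how the split was drawn, how the interval was computed, or which class was treated as positive.

```python
import random

def split_indices(n, test_frac=0.30, seed=0):
    """Shuffle 0..n-1 and hold out test_frac of them (assumed: plain random split)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)
    return idx[n_test:], idx[:n_test]  # (train/validation, test)

def k_folds(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def average_precision(y_true, scores):
    """Average precision, a step-wise estimate of the area under the PR curve.

    y_true holds 0/1 labels (1 = the class treated as positive, hypothetical
    here); scores are model outputs where higher means "more likely positive".
    Tied scores are ranked in input order, a simplification.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(y_true)
    if n_pos == 0:
        return float("nan")
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos

def bootstrap_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for average precision (assumed CI method)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in sample]
        if sum(ys) == 0:
            continue  # skip resamples that contain no positives
        stats.append(average_precision(ys, [scores[i] for i in sample]))
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi
```

In use, `split_indices(599)` would give the train/validation and test index sets, `k_folds` would be run over the train/validation indices for model selection, and `average_precision` with `bootstrap_ci` would be applied once to the held-out test predictions.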