"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To apply conformal prediction to a deep learning (DL) model for intracranial hemorrhage (ICH) detection and evaluate model performance in detection as well as model accuracy in identifying challenging cases. Materials and Methods This was a retrospective (November 2017 through December 2017) study of 491 noncontrast head CT volumes from the CQ500 dataset in which three senior radiologists annotated sections containing ICH. The dataset was split into definite and challenging (uncertain) subsets, where challenging images were defined as those in which there was disagreement among readers. A DL model was trained on 146 patients (mean age = 45.7, 70 females, 76 males) from the definite data (training dataset) to perform ICH localization and classification into five classes. To develop an uncertainty-aware DL model, 1,546 sections of the definite data (calibration dataset) was used for Mondrian conformal prediction (MCP). The uncertainty-aware DL model was tested on 8,401 definite and challenging sections to assess its ability to identify challenging sections. The difference in predictive performance (P value) and ability to identify challenging sections (accuracy) were reported. Results After the MCP procedure, the model achieved an F1 score of 0.920 for ICH classification on the test dataset. Additionally, it correctly identified 6,837 of the 6,856 total challenging sections as challenging (99.7% accuracy). It did not incorrectly label any definite sections as challenging. Conclusion The uncertainty-aware MCP-augmented DL model achieved high performance in ICH detection and high accuracy in identifying challenging sections, suggesting its usefulness in automated ICH detection and potential to increase trustworthiness of DL models in radiology. ©RSNA, 2024.
Read full abstract