Few artificial intelligence (AI) models have been implemented in real-world clinical practice. Our study aimed to develop a CT angiography (CTA)-based AI model for intracranial aneurysm detection, assess how it helps clinicians improve diagnostic performance, and validate its application in real-world clinical implementation. We developed a deep-learning model using 16 546 head and neck CTA examination images from 14 517 patients at eight Chinese hospitals. For an adapted, stepwise implementation and evaluation, we recruited 120 certified clinicians from 15 geographically different hospitals. Initially, the AI model was externally validated with images of 900 digital subtraction angiography-verified CTA cases (examinations) and compared with the performance of 24 clinicians who each viewed 300 of these cases (stage 1). Next, as a further external validation, a multi-reader multi-case study enrolled 48 clinicians to individually review 298 digital subtraction angiography-verified CTA cases (stage 2). The clinicians reviewed each CTA examination twice (ie, with and without the AI model), separated by a 4-week washout period. Then, a randomised open-label comparison study enrolled 48 clinicians to assess the acceptance and performance of this AI model (stage 3). Finally, the model was prospectively deployed and validated in 1562 real-world clinical CTA cases. In the internal dataset, the AI model achieved a patient-level diagnostic sensitivity of 0·957 (95% CI 0·939-0·971); in the external dataset, its patient-level diagnostic sensitivity was higher than that of clinicians (0·943 [0·921-0·961] vs 0·658 [0·644-0·672]; p<0·0001).
In the multi-reader multi-case study, the AI-assisted strategy improved clinicians' diagnostic performance on both a per-patient basis (the area under the receiver operating characteristic curves [AUCs]; 0·795 [0·761-0·830] without AI vs 0·878 [0·850-0·906] with AI; p<0·0001) and a per-aneurysm basis (the area under the weighted alternative free-response receiver operating characteristic curves; 0·765 [0·732-0·799] without AI vs 0·865 [0·839-0·891] with AI; p<0·0001). Reading time decreased with the aid of the AI model (87·5 s without AI vs 82·7 s with AI; p<0·0001). In the randomised open-label comparison study, clinicians in the AI-assisted group had a high acceptance of the AI model (92·6% adoption rate) and a higher AUC than the control group (0·858 [95% CI 0·850-0·866] vs 0·789 [0·780-0·799]; p<0·0001). In the prospective study, the AI model had a 0·51% (8/1570) error rate due to poor-quality CTA images and recognition failure. The model had a high negative predictive value of 0·998 (0·994-1·000) and significantly improved the diagnostic performance of clinicians: AUC improved from 0·787 (95% CI 0·766-0·808) to 0·909 (0·894-0·923; p<0·0001) and patient-level sensitivity improved from 0·590 (0·511-0·666) to 0·825 (0·759-0·880; p<0·0001). This AI model demonstrated strong clinical potential for intracranial aneurysm detection with improved clinician diagnostic performance, high acceptance, and practical implementation in real-world clinical cases. Funding: National Natural Science Foundation of China. For the Chinese translation of the abstract, see the Supplementary Materials section.
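For readers less familiar with the patient-level metrics reported above, the following minimal sketch shows how sensitivity, negative predictive value, and the processing error rate are conventionally defined. It is illustrative only and is not the study's analysis code: the function names are hypothetical, and the confusion-matrix counts in the second half are invented for demonstration. Only the 8/1570 figure is taken from the abstract.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of aneurysm-positive patients that were flagged as positive."""
    return tp / (tp + fn)


def negative_predictive_value(tn: int, fn: int) -> float:
    """Share of negative calls that were truly aneurysm-free."""
    return tn / (tn + fn)


def error_rate(failed: int, total: int) -> float:
    """Share of cases the model could not process."""
    return failed / total


# Reported in the abstract: 8 of 1570 prospective cases failed.
prospective_error = error_rate(8, 1570)
print(f"{prospective_error:.2%}")  # 0.51%

# HYPOTHETICAL counts, chosen only to demonstrate the formulas;
# they are not data from the study.
tp, fn, tn = 90, 10, 880
example_sensitivity = sensitivity(tp, fn)
example_npv = negative_predictive_value(tn, fn)
print(f"sensitivity={example_sensitivity:.3f}, NPV={example_npv:.3f}")
```

Note that 1570 minus the 8 failed cases gives 1562, which matches the number of real-world cases reported for the prospective validation, although the abstract does not state this relationship explicitly.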