Abstract
This study aims to evaluate a deep learning pipeline for detecting clinically significant prostate cancer (csPCa), defined as Gleason Grade Group (GGG) ≥ 2, using biparametric MRI (bpMRI) and compare its performance with radiological reading. The training dataset included 4381 bpMRI cases (3800 positive and 581 negative) across three continents, with 80% annotated using PI-RADS and 20% with Gleason Scores. The testing set comprised 328 cases from the PROSTATEx dataset, including 34% positive (GGG ≥ 2) and 66% negative cases. A 3D nnU-Net was trained on bpMRI for lesion detection, evaluated using histopathology-based annotations, and assessed with patient- and lesion-level metrics, along with lesion volume, and GGG. The algorithm was compared to non-expert radiologists using multi-parametric MRI (mpMRI). The model achieved an AUC of 0.83 (95% CI: 0.80, 0.87). Lesion-level sensitivity was 0.85 (95% CI: 0.82, 0.94) at 0.5 False Positives per volume (FP/volume) and 0.88 (95% CI: 0.79, 0.92) at 1 FP/volume. Average Precision was 0.55 (95% CI: 0.46, 0.64). The model showed over 0.90 sensitivity for lesions larger than 650 mm³ and exceeded 0.85 across GGGs. It had higher true positive rates (TPRs) than radiologists equivalent FP rates, achieving TPRs of 0.93 and 0.79 compared to radiologists' 0.87 and 0.68 for PI-RADS ≥ 3 and PI-RADS ≥ 4 lesions (p ≤ 0.05). The DL model showed strong performance in detecting csPCa on an independent test cohort, surpassing radiological interpretation and demonstrating AI's potential to improve diagnostic accuracy for non-expert radiologists. However, detecting small lesions remains challenging. Question Current prostate cancer detection methods often do not involve non-expert radiologists, highlighting the need for more accurate deep learning approaches using biparametric MRI. Findings Our model outperforms radiologists significantly, showing consistent performance across Gleason Grade Groups and for medium to large lesions. Clinical relevance This AI model improves prostate detection accuracy in prostate imaging, serves as a benchmark with reference performance on a public dataset, and offers public PI-RADS annotations, enhancing transparency and facilitating further research and development.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have