Purpose: Unnecessary prostate biopsies for detecting prostate cancer (PCa) should be minimized. Therefore, this study developed a machine learning (ML) model to predict PCa in Korean men and evaluated its usability.

Materials and Methods: We retrospectively analyzed clinical data from 928 patients who underwent prostate biopsies at Kangwon National University Hospital between May 2013 and May 2023. Of these, 377 (40.6%) were diagnosed with PCa, and 551 (59.4%) did not have cancer. For external validation, clinical data from 385 patients aged 48–89 years who underwent prostate biopsies from September 2005 to September 2023 at Wonju Severance Christian Hospital were also included. Twenty-two clinical features were considered for developing an ML model to predict PCa. Features were selected based on their contributions to model performance, leading to the inclusion of 15 features. A meta-learner was constructed using logistic regression to predict the probability of PCa, and the classifier was trained and validated on randomly sampled training and test sets at an 8:2 ratio.

Results: The prostate health index, prostate volume, age, nodule on digital rectal examination, and prostate-specific antigen were the top 5 features for predicting PCa. The area under the receiver operating characteristic curve (AUC) of the meta-learner logistic regression model was 0.89, and the accuracy, sensitivity, and specificity were 0.828, 0.711, and 0.909, respectively. Our model also showed excellent prediction performance for high-grade PCa (Gleason score of 7 or higher), with an AUC of 0.903. Furthermore, we evaluated the performance of the model using external cohort clinical data and achieved an AUC of 0.863.

Conclusions: Our ML model excelled in predicting PCa, specifically clinically significant PCa. Although extensive cross-validation in other clinical cohorts is needed, this ML model is a promising option for future diagnostics.
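For readers less familiar with the reported metrics, the following minimal Python sketch shows how accuracy, sensitivity, and specificity follow from a binary confusion matrix. The counts are hypothetical: the abstract does not report a confusion matrix, and these values were chosen only because they are arithmetically consistent with the reported figures on a 186-patient test split (about 20% of 928). The class and function names are likewise illustrative and do not come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with PCa as the positive class."""
    tp: int  # cancers correctly predicted as cancer
    fn: int  # cancers predicted as non-cancer (missed)
    tn: int  # non-cancers correctly predicted as non-cancer
    fp: int  # non-cancers predicted as cancer


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true cancers that the model detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-cancers that the model correctly clears."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all patients classified correctly."""
    total = c.tp + c.fn + c.tn + c.fp
    return (c.tp + c.tn) / total


# HYPOTHETICAL counts (not reported in the source), picked so that the
# arithmetic matches the published accuracy, sensitivity, and specificity.
example = ConfusionCounts(tp=54, fn=22, tn=100, fp=10)

if __name__ == "__main__":
    print(f"accuracy    = {accuracy(example):.3f}")
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
```

Read this way, the high specificity relative to sensitivity suggests a model that rarely flags men without cancer, which fits the stated aim of reducing unnecessary biopsies, at the cost of missing some cancers at the chosen decision threshold.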