Abstract

Purpose: Radiomic studies, which correlate features of patients’ medical images with patient outcomes, often deal with small datasets. Consequently, results can suffer from a lack of replicability and stability. This paper establishes a methodology to assess and reduce the impact of statistical fluctuations that may occur in small datasets. Such fluctuations can lead to false discoveries, particularly when applying feature selection or machine learning (ML) methods commonly used in the radiomics literature.

Methods: Two feature selection methods were created, one for choosing single predictive features, and another for obtaining feature sets that could be combined in a predictive model. The features were combined using ML tools less affected by overfitting (Naive Bayes, logistic regression, and linear support vector machines). Only three features were allowed to be combined at a time, further limiting overfitting. This methodology was applied to MR images from small datasets in metastatic liver disease (69 samples) and primary uterine adenocarcinoma (93 samples). The outcomes studied were desmoplasia (for liver metastases) and lymphovascular space invasion (LVSI), cancer staging (FIGO), and tumor grade (for uterine tumors). For outcomes in uterine cancer, the predictive models were tested on independent subsets.

Results: For the combined predictive feature approach, the model for LVSI, a prognostic factor that a human reader cannot detect, yielded AUC = 0.87 ± 0.07 and accuracy = 0.84 ± 0.09 in the testing set. For FIGO staging, AUC = 0.81 ± 0.03 and accuracy = 0.79 ± 0.08. For tumor grade, AUC = 0.76 ± 0.05 and accuracy = 0.70 ± 0.08.

Conclusion: Despite considering a large set ($\sim 10^{4}$) of texture features, the false discovery avoidance methodology allowed only robust predictive models to be retained. Thus, the stringent false discovery avoidance methods introduced here do not preclude the discovery of promising correlations.
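The false-discovery risk described in the Purpose can be illustrated with a small simulation. The sketch below is not the paper's feature selection method. It only shows how, when many candidate features are screened on a small dataset, the best-looking feature can reach a seemingly useful AUC even though every feature is pure noise. The sample size of 69 matches the liver dataset mentioned above. The feature count (1000, kept below the $\sim 10^{4}$ of the abstract for speed), the balanced class split, the Gaussian noise, and the comparison size of 276 are assumptions made for illustration, as are the function names.

```python
import random


def auc(scores, labels):
    """AUC from the Mann-Whitney rank sum (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    # Sum of the 1-based ranks held by the positive class.
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i] == 1)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def best_noise_auc(n_samples, n_features, rng):
    """Largest single-feature AUC among features that are pure noise."""
    # Hypothetical, roughly balanced binary outcome.
    labels = [1] * (n_samples // 2) + [0] * (n_samples - n_samples // 2)
    best = 0.5
    for _ in range(n_features):
        feature = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        a = auc(feature, labels)
        # A feature is "predictive" in either direction, so fold around 0.5.
        best = max(best, a, 1.0 - a)
    return best


rng = random.Random(0)  # fixed seed so the run is repeatable
small = best_noise_auc(n_samples=69, n_features=1000, rng=rng)
large = best_noise_auc(n_samples=276, n_features=1000, rng=rng)
print(f"best AUC among noise features, 69 samples:  {small:.2f}")
print(f"best AUC among noise features, 276 samples: {large:.2f}")
```

In this sketch the best noise feature looks markedly better in the smaller dataset, because the spread of the AUC under the null hypothesis grows as the sample size shrinks. This is the kind of fluctuation that a false discovery avoidance step has to rule out before a feature is reported as predictive.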
