Abstract

Radiomics characterizes tumor phenotypes by extracting large numbers of quantitative features from radiological images. Several studies have shown that radiomic features provide prognostic value in predicting clinical outcomes. However, challenges including feature redundancy, unbalanced data, and small sample sizes have led to relatively low predictive accuracy. In this study, we explore different strategies for overcoming these challenges and improving the predictive performance of radiomics-based prognosis for non-small cell lung cancer (NSCLC). CT images of 112 patients (mean age 75 years) with NSCLC who underwent stereotactic body radiotherapy were used to predict recurrence, death, and recurrence-free survival using a comprehensive radiomics analysis. Different feature selection and predictive modeling techniques were compared to determine the optimal configuration for prognosis analysis. To address feature redundancy, the analysis indicated that Random Forest models and Principal Component Analysis were the optimal predictive modeling and feature selection methods, respectively, for achieving high prognostic performance. To address unbalanced data, the Synthetic Minority Over-sampling Technique (SMOTE) was found to significantly increase predictive accuracy. A full analysis of variance showed that data endpoints, feature selection techniques, and classifiers were significant factors affecting predictive accuracy, suggesting that these factors must be investigated when building radiomics-based predictive models for cancer prognosis.
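
To make the over-sampling idea concrete, here is a minimal sketch of how the Synthetic Minority Over-sampling Technique creates new minority-class samples: it picks a minority sample, picks one of its nearest minority-class neighbours, and interpolates a new point between the two. This is an illustration only, not the study's implementation; the function name `smote`, the parameters `n_synthetic`, `k`, and `seed`, and the toy data are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    Hypothetical helper for illustration; k and seed are assumed defaults.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to it.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours and interpolate towards it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: four 2-feature samples (made-up values).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_synthetic=6)
```

Because every synthetic point lies on a line segment between two real minority samples, the new points stay inside the region the minority class already occupies rather than duplicating existing samples.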

Highlights

  • Current clinical workflows generate thousands of images per patient, making it impractical for clinicians to study all of them

  • We investigate the predictive performance of combinations of 5 unfiltered feature reduction techniques and 8 different classifiers applied to quantitative CT features from a dataset of non-small cell lung cancer (NSCLC) patients with 3 clinical outcomes, namely recurrence, death, and recurrence-free survival

  • As can be seen for recurrence (REC) in Fig. 2, the best result is achieved by the Random Forest (RF) classifier with Near Zero Variance (NZV) feature selection (AUC = 0.76)
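
For readers unfamiliar with the AUC figure quoted above, the following is a minimal sketch of one standard way to compute it: the fraction of (positive, negative) patient pairs in which the positive case receives the higher predicted score, with ties counted as half. The source does not state how its AUC values were computed, so this is an assumption-level illustration; the function name `roc_auc` and the toy labels and scores are invented for the example.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: P(score of a positive > score of a negative).

    Hypothetical helper for illustration; labels are 1 (event) or 0 (no event).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: 2 patients with the event, 2 without (made-up scores).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
auc = roc_auc(labels, scores)  # 3 of the 4 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.76 should be read.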


Summary

Introduction

Current clinical workflows generate thousands of images per patient, making it impractical for clinicians to study all of them. By capturing the entire tumor site and extracting information from 3D images, radiomics has the distinct advantage of assessing tissue heterogeneity, a well-described phenomenon in cancer in which cell phenotypes vary. This is in contrast to other clinical procedures such as biopsy, where only a small fraction of the tumor is sampled and there is a significant chance that the index tumor is missed entirely[3], leading to misinterpretations and suboptimal clinical decisions. Recent studies have found that radiomic features may have significant associations with clinical outcomes and gene-expression levels[4,5,6,7,8,9]. However, challenges such as feature redundancy, unbalanced data, and small sample sizes negatively affect the prediction accuracy of prognosis models and need to be addressed when building an efficient radiomics-based prognosis model.

