Abstract
Clustering is the task of identifying groups of similar subjects according to certain criteria. The AJCC staging system can be thought of as a clustering mechanism that groups patients by disease stage. This grouping drives prognosis and influences treatment. The goal of this work is to evaluate the efficacy of machine learning algorithms in clustering patients into discriminative groups to improve prognosis for overall survival (OS) and relapse-free survival (RFS) outcomes. We apply clustering to retrospectively collected data from 644 head and neck cancer patients, including both clinical and radiomic features. To incorporate outcome information into the clustering process and to deal with the large proportion of censored samples, the feature space was scaled by regression coefficients fitted with a proxy dependent variable, martingale residuals, in place of follow-up time. Two clusters were identified and evaluated using cross-validation. The Kaplan-Meier (KM) curves of the two clusters differ significantly for OS and RFS (p-value < 0.0001). Moreover, using the cluster label in addition to the clinical features gave a relative predictive improvement over using clinical features alone: AUC increased by 5.7% and 13.0% for OS and RFS, respectively.
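The outcome-informed scaling described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the martingale residuals here come from a covariate-free model (event indicator minus the Nelson-Aalen cumulative hazard), the regression is ordinary least squares on standardized features, the clustering is a plain 2-means, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def null_martingale_residuals(time, event):
    """Assumed proxy outcome: event indicator minus Nelson-Aalen cumulative hazard."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    cum_hazard = np.zeros_like(time)
    running = 0.0
    for t in np.unique(time):                  # distinct times, ascending
        at_risk = np.sum(time >= t)            # subjects still under observation
        deaths = np.sum(event[time == t])      # events occurring exactly at t
        running += deaths / at_risk
        cum_hazard[time == t] = running
    return event - cum_hazard

def outcome_scaled_features(X, residuals):
    """Standardize features, regress the residuals on them, scale each column by its coefficient."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # assumes no constant column
    design = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(design, np.asarray(residuals, dtype=float), rcond=None)
    return Z * coef[1:]                        # drop the intercept

def two_means(X, n_iter=50):
    """Plain 2-means with a deterministic start (extremes of the highest-variance column)."""
    X = np.asarray(X, dtype=float)
    axis = np.argmax(X.var(axis=0))
    centers = np.stack([X[np.argmin(X[:, axis])], X[np.argmax(X[:, axis])]])
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for k in (0, 1):
            if np.any(labels == k):            # keep the old center if a cluster empties
                new_centers[k] = X[labels == k].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Synthetic, right-censored data for illustration only (not the study data).
rng = np.random.default_rng(0)
n = 200
features = rng.normal(size=(n, 3))
true_time = rng.exponential(1.0 / np.exp(1.5 * features[:, 0]))
censor_time = rng.exponential(2.0, size=n)
follow_up = np.minimum(true_time, censor_time)
died = true_time <= censor_time

residuals = null_martingale_residuals(follow_up, died)
scaled = outcome_scaled_features(features, residuals)
labels = two_means(scaled)
print(np.bincount(labels, minlength=2))        # cluster sizes
```

Scaling by the coefficients stretches the features that track the residuals and shrinks the rest, so distances in the clustering step are dominated by outcome-relevant directions.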
Highlights
In the era of personalized cancer medicine, innovative sources of meaningful data are critically needed
Aerts et al. have reduced a large-scale lung data set to specific radiomic features, which could be cross-applied to head and neck cancer patients [23]
We evaluate the resulting groups by model comparison, using the group label as a feature in a Cox Proportional Hazards (Cox) model and considering the Akaike Information Criterion (AIC) and the likelihood ratio test (LRT), and by evaluating Kaplan-Meier (KM) curves
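As a small illustration of the model comparison mentioned above, the sketch below computes AIC and a likelihood ratio test for two nested models. It assumes the larger model adds exactly one parameter (a binary cluster label), so the test statistic is compared against a chi-square distribution with one degree of freedom. The function names and the log-likelihood values are hypothetical placeholders, not results from the study.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def lrt_one_df(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models that differ by one parameter."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Placeholder partial log-likelihoods (hypothetical, for illustration only).
loglik_clinical = -100.0        # clinical features only, 3 parameters assumed
loglik_with_cluster = -98.0     # clinical features + cluster label, 4 parameters

print(aic(loglik_clinical, 3), aic(loglik_with_cluster, 4))
print(lrt_one_df(loglik_clinical, loglik_with_cluster))
```

A lower AIC for the model with the cluster label, together with a small LRT p-value, would indicate that the label adds information beyond the clinical features.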
Summary
In the era of personalized cancer medicine, innovative sources of meaningful data are critically needed. The proposed approach combines supervised and unsupervised methods so that clustering can be used to improve prediction of the outcomes of interest in right-censored oropharyngeal head and neck cancer data. The aims of this study are as follows: (1) incorporate outcome information into the cluster analysis; (2) identify discriminative clusters using radiomic signatures and patient characteristics available at the time of diagnosis; (3) use the cluster labels to stratify the patients, generate KM curves for each cluster, and compare them with AJCC stage; and (4) evaluate the predictive performance of including the cluster label as a feature in a Cox model and a random survival forest (RSF) for OS and RFS outcomes.
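Aim (3) can be illustrated with a minimal Kaplan-Meier estimator and a two-group log-rank test. This is a sketch assuming exactly two non-empty groups whose follow-up overlaps; the source does not name the test behind its reported p-values, so the log-rank test is an assumption here, and the function names and toy data are hypothetical.

```python
import math
import numpy as np

def kaplan_meier(time, event):
    """Return distinct event times and the KM survival estimate just after each."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    survival, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & event)
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)

def logrank_test(time, event, group):
    """Two-group log-rank test; `group` is a boolean mask for one of the clusters."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    obs_minus_exp, variance = 0.0, 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        n, n1 = at_risk.sum(), (at_risk & group).sum()
        dying = (time == t) & event
        d, d1 = dying.sum(), (dying & group).sum()
        obs_minus_exp += d1 - d * n1 / n       # observed minus expected in the group
        if n > 1:                              # hypergeometric variance term
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0)) # chi-square with 1 degree of freedom
    return chi2, p_value

# Toy data for illustration only: cluster A fails earlier than cluster B.
time = [1, 2, 3, 4]
event = [1, 1, 1, 1]
in_cluster_a = [True, True, False, False]

print(kaplan_meier(time, event))
print(logrank_test(time, event, in_cluster_a))
```

Calling `kaplan_meier` separately on each cluster's patients gives the per-cluster curves, and `logrank_test` quantifies how strongly they differ.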