Abstract

BACKGROUND Circulating cell-free DNA (ccfDNA) is a promising tool for monitoring patients with high-grade gliomas (HGGs), complementing follow-up with conventional MRI (cMRI) and advanced MRI (aMRI). MRI harbors a large amount of quantitative, subvisual data that can be used to build models predicting disease behavior. Examining the relationship between ccfDNA levels and quantitative features extracted from cMRI and aMRI gives further insight into this novel biomarker. The aim is to combine the biological value of ccfDNA with the spatial resolution of imaging biomarkers to improve the monitoring of HGGs.

MATERIAL AND METHODS Features were extracted from each T1 post-contrast and FLAIR image using the automatic FLAIR hyperintensity and enhancement segmentation tool (Pyradiomics software, v2.2.0). The radiomic dataset underwent Z-score normalization, feature reduction, and feature selection. Highly correlated features were excluded using a Pearson correlation coefficient threshold of 0.9. The ‘F_regression’ function was used to select the most important features with respect to ccfDNA concentration, overall survival (OS), and progression-free survival (PFS). The resulting dataset was used as input for four machine learning models: Linear Regression (LR), Support Vector Regression (SVR), Random Forest (RF), and least absolute shrinkage and selection operator (LASSO). Hyperparameter tuning was performed at the first training iteration of each model using GridSearchCV with 4-fold cross-validation and model-specific parameters. The train-test strategy randomly split the dataset into 80% training and 20% testing subsets across 100 iterations to ensure robustness and reduce bias. Root mean square error (RMSE) was used to assess model performance. The RMSE values of each run were averaged based on sample test frequencies.

RESULTS A total of 405 features were successfully extracted from the T1 post-contrast and FLAIR images using both tumoral masks. Of these, 287 were found to be highly correlated and were therefore removed from the dataset. The ‘F_regression’ function with a 90th-percentile threshold selected 12 features: 3 shape features from the FLAIR segmentation, 6 features from the T1 post-contrast image, and 3 features from the FLAIR image. SVR achieved the best (lowest) RMSE of the four models when predicting ccfDNA concentration, reaching an RMSE of 8.58. An analogous strategy was used for the models built to predict PFS and OS, which reached RMSEs of 5.89 and 6.77, respectively.

CONCLUSION Machine-learning models based on radiomic features combined with clinical variables are valuable tools for predicting OS and PFS in patients with HGGs. The radiomic signature provides additional information, beyond tumoral volume, for forecasting ccfDNA levels. Combining both techniques could provide a more precise assessment of disease status.
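The workflow described in the methods can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, the search grids are hypothetical (the abstract does not list the tuned parameters), the 90th-percentile threshold is assumed to mean keeping the top 10% of features by F-score, and a plain mean of per-run RMSE stands in for the frequency-based averaging, whose exact form is not specified. The helper names `zscore`, `drop_correlated`, `select_features`, and `evaluate` are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectPercentile, f_regression
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical search grids; the abstract does not list the tuned parameters.
MODELS = {
    "LR": (LinearRegression(), {"fit_intercept": [True, False]}),
    "SVR": (SVR(), {"C": [1, 10, 100], "kernel": ["linear", "rbf"]}),
    "RF": (RandomForestRegressor(random_state=0), {"n_estimators": [50, 100]}),
    "LASSO": (Lasso(max_iter=10000), {"alpha": [0.01, 0.1, 1.0]}),
}


def zscore(df):
    """Z-score normalize every feature column."""
    scaled = StandardScaler().fit_transform(df)
    return pd.DataFrame(scaled, columns=df.columns, index=df.index)


def drop_correlated(df, threshold=0.9):
    """Drop one feature from every pair with |Pearson r| above threshold."""
    corr = df.corr(method="pearson").abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


def select_features(X, y, percentile=10):
    """Keep the top `percentile` percent of features ranked by F-score."""
    selector = SelectPercentile(f_regression, percentile=percentile).fit(X, y)
    return X.loc[:, selector.get_support()]


def evaluate(X, y, n_iter=100, seed=0):
    """Mean test RMSE per model over repeated random 80/20 splits."""
    rmse = {name: [] for name in MODELS}
    tuned = {}
    for i in range(n_iter):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, random_state=seed + i
        )
        for name, (model, grid) in MODELS.items():
            if i == 0:
                # Tune once, on the first training split, with 4-fold CV.
                search = GridSearchCV(
                    model, grid, cv=4, scoring="neg_root_mean_squared_error"
                )
                tuned[name] = search.fit(X_tr, y_tr).best_params_
            fitted = clone(model).set_params(**tuned[name]).fit(X_tr, y_tr)
            mse = mean_squared_error(y_te, fitted.predict(X_te))
            rmse[name].append(np.sqrt(mse))
    return {name: float(np.mean(values)) for name, values in rmse.items()}


# Synthetic stand-in for the radiomic table and the ccfDNA target.
X_raw, y = make_regression(
    n_samples=60, n_features=40, n_informative=5, noise=10.0, random_state=0
)
features = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(40)])
features["f0_copy"] = 2.0 * features["f0"] + 0.01  # redundant by construction

scaled = zscore(features)
reduced = drop_correlated(scaled, threshold=0.9)
selected = select_features(reduced, y, percentile=10)
results = evaluate(selected, y, n_iter=5)  # the abstract reports 100 iterations
print(results)
```

The sketch follows the order given in the abstract, so normalization and feature selection run on the full dataset before splitting. Fitting those steps inside each training split instead (for example with a scikit-learn `Pipeline`) would avoid leaking test information into the selection step.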