Abstract
Introduction: Head and neck cancer is a deadly disease with relatively stagnant cure rates. Predictive oncology might allow treatment stratification for surgery, radiation, and chemotherapy combinations based on both stage and other risks. As more high-dimensional data from genomic expression datasets become publicly available, machine learning algorithms could be used for staging/treatment refinement. In the present study, we applied machine learning algorithms to our previously published smaller U133A Affymetrix gene expression dataset and to the NCI Cancer Genome Atlas (TCGA).
Methods: Models were trained separately on the TCGA and Affymetrix chip datasets to classify T stage as early (T1/2) versus advanced (T3+) and N stage as N0 versus N1+. TCGA data were split by HPV status into two datasets. Models were trained and tested on random splits stratified by stage: for the large datasets (HPV- T and HPV- N, n>=350), models were trained on 80% of samples and tested on the unseen 20%, while the remaining datasets (n<=50) were split 70%-30% to leave enough samples for testing. The scikit-learn models used were Bagged (aggregated) Support Vector Classifiers (BSVCs), Random Forest (RF), Gradient Boosting (GB), and Linear Regression (LR). LR was included unaltered as a baseline, while the other three models underwent hyperparameter tuning (assessed for all combinations of selected settings) and feature selection: models were trained and tested on the whole expression data, which were then iteratively reduced to the top half of genes by importance until <=100 genes remained. Models were assessed by test accuracy and area under the receiver operating characteristic curve (AUC-ROC); the hyperparameter-tuned models and their respective gene lists were then assessed over 50 trials with random initializations and data splits.
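The workflow described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: it runs on synthetic data, uses only Random Forest, and makes several assumptions the abstract does not confirm. Specifically, it assumes that gene importance comes from `feature_importances_`, that gene selection and tuning use the training split only, that tuning is an internally cross-validated `GridSearchCV` over a small hypothetical grid, and that the 95% CI is a normal approximation over trials. The helper names `halve_features`, `run_trial`, and `mean_ci` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split


def halve_features(X_train, y_train, max_genes=100, seed=0):
    """Keep the top half of features by RF importance until <= max_genes remain."""
    keep = np.arange(X_train.shape[1])
    while keep.size > max_genes:
        rf = RandomForestClassifier(n_estimators=50, random_state=seed)
        rf.fit(X_train[:, keep], y_train)
        order = np.argsort(rf.feature_importances_)[::-1]  # most important first
        keep = keep[order[: keep.size // 2]]
    return keep


def run_trial(X, y, seed, test_size=0.2):
    """One trial: stratified split, gene selection, grid search, test metrics."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    genes = halve_features(X_tr, y_tr, seed=seed)
    # Hypothetical grid; the abstract does not list the settings that were searched.
    grid = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
        cv=3,
    )
    grid.fit(X_tr[:, genes], y_tr)
    pred = grid.predict(X_te[:, genes])
    proba = grid.predict_proba(X_te[:, genes])[:, 1]
    return accuracy_score(y_te, pred), roc_auc_score(y_te, proba), genes


def mean_ci(values, z=1.96):
    """Mean and normal-approximation 95% CI across trials (an assumed CI method)."""
    values = np.asarray(values, dtype=float)
    half = z * values.std(ddof=1) / np.sqrt(values.size)
    return values.mean(), values.mean() - half, values.mean() + half


if __name__ == "__main__":
    # Synthetic stand-in for an expression matrix (samples x genes).
    X, y = make_classification(
        n_samples=120, n_features=400, n_informative=20, random_state=0
    )
    results = [run_trial(X, y, seed) for seed in range(5)]  # the study used 50 trials
    acc_mean, acc_lo, acc_hi = mean_ci([r[0] for r in results])
    auc_mean, auc_lo, auc_hi = mean_ci([r[1] for r in results])
    print(f"accuracy {acc_mean:.3f} [{acc_lo:.3f}-{acc_hi:.3f}]")
    print(f"AUC-ROC  {auc_mean:.3f} [{auc_lo:.3f}-{auc_hi:.3f}]")
```

Passing `test_size=0.3` to `run_trial` would mirror the 70%-30% split used for the smaller datasets. Restricting gene selection to the training split is a design choice made in this sketch so that the held-out samples stay unseen.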
Results: Hyperparameter-tuned RF and BSVC models with selected gene lists significantly outperformed LR in 6/6 tasks, while GB outperformed LR in 2/6 tasks. Examples of the highest-performing models (test accuracy [95% CI], AUC-ROC [95% CI]) were as follows. For Affymetrix N, BSVCs (0.893 [0.872-0.915], 0.95 [0.931-0.969]) outperformed LR (0.49 [0.453-0.527], 0.476 [0.435-0.517]). For TCGA HPV- N, BSVCs (0.832 [0.818-0.847], 0.892 [0.877-0.907]) outperformed LR (0.581 [0.568-0.595], 0.615 [0.6-0.631]). For TCGA HPV+ T, RF (0.835 [0.815-0.854], 0.954 [0.942-0.965]) outperformed LR (0.669 [0.645-0.694], 0.76 [0.732-0.789]).
Conclusions: Using this hyperparameter tuning and feature selection approach, we have identified machine learning techniques that can analyze high-dimensional data from small and large genomic expression datasets to predict clinical features and identify predictive genes for further analysis. The approach is applicable to head and neck cancer, a relatively rare malignancy (about 3% of all cancers).
Citation Format: Nathan DeMichaelis, Amrit Menon, Dalton Schutte, Frank G. Ondrey. Utilizing machine learning to predict clinical staging with head and neck cancer gene expression datasets [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2025; Part 1 (Regular Abstracts); 2025 Apr 25-30; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2025;85(8_Suppl_1):Abstract nr 2400.