3541 Background: The incidence of colorectal cancer (CRC) is rising among young adults, and disparities in outcomes by race/ethnicity persist across all ages. Gene expression signatures, as well as the consensus molecular subtypes (CMS) derived from them, have been proposed to predict prognosis and therapy response in CRC. However, it is unclear whether gene expression or CMS is associated with the racial disparities observed in CRC. We assessed whether race or genetic ancestry is associated with CMS or gene expression patterns in a deidentified cohort of 1,768 CRC patients.

Methods: Patients’ tumors underwent profiling with the Tempus xT NGS 648-gene assay as well as full-transcriptome RNA sequencing. We used a set of 654 ancestry-informative markers to infer genetic ancestry likelihoods for Africa (AFR), America (AMR), East Asia (EAS), Europe (EUR), and South Asia (SAS). Race/ethnicity labels, often missing in real-world data, were imputed using ancestry proportions from the literature and adjusted based on observed metadata. Gene expression data were used to assign a CMS to each patient (CMS1, 2, 3, 4, or indeterminate) using CMScaller, and multinomial logistic regression was used to assess associations of CMS with imputed race/ethnicity labels and with ancestry proportions. We then assessed differential expression (DE) in the MSigDB hallmark and C2 BioCarta gene sets using four separate workflows. The first two workflows used limma-voom followed by ROAST to assess DE among the imputed labels and then among the ancestry proportions (isometric log ratio transformed). The other two workflows used GSVA followed by limma.

Results: Among 1,768 patients, 240 were imputed as non-Hispanic (NH) Black, 94 as NH Asian, 261 as Hispanic/Latino/Native American (HLN), and 1,173 as NH White. Compared with NH White patients, NH Black patients had higher odds of CMS3 vs CMS1 (OR = 2.66, p < 0.001) and HLN patients had higher odds of indeterminate CMS vs CMS1 (OR = 1.90, p = 0.020). AFR ancestry was significantly associated with CMS3 (OR = 1.05 per doubling in AFR proportion, p = 0.047) and with indeterminate CMS (OR = 1.07, p = 0.023). In the gene set analysis, NH Black race/ethnicity was associated with over-expression of the BioCarta WNT pathway gene set. Both AFR ancestry and NH Black race/ethnicity were associated with under-expression of the MSigDB hallmark coagulation and BioCarta alternative complement gene sets and with over-expression of the PITX2 pathway gene set (all significant with both ROAST and GSVA).

Conclusions: NH Black race/ethnicity and AFR ancestry were associated with higher rates of CMS3, a subtype that is associated with KRAS mutation and was previously reported to be more common among Black patients. The associations of indeterminate CMS with AFR ancestry and HLN race/ethnicity highlight the need to use diverse patient cohorts when training unsupervised learning models to improve prognosis prediction in non-White patients.
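
For readers unfamiliar with the isometric log ratio (ILR) transform applied to the ancestry proportions in the Methods, the following is a minimal sketch of one common form of it. The abstract does not state which ILR basis or zero-handling rule was used, so the pivot-coordinate basis, the pseudocount, and the function name `ilr_pivot` are all assumptions made here for illustration only.

```python
import numpy as np


def ilr_pivot(props, pseudo=1e-6):
    """Pivot-coordinate ILR transform of compositional rows (hypothetical helper).

    props  : array-like of shape (n_samples, n_parts); each row holds
             proportions, e.g. AFR, AMR, EAS, EUR, SAS.
    pseudo : small constant added before re-closing each row so that zero
             proportions have a finite logarithm (an assumption, not taken
             from the abstract).

    Returns an array of shape (n_samples, n_parts - 1).
    """
    x = np.atleast_2d(np.asarray(props, dtype=float)) + pseudo
    x = x / x.sum(axis=1, keepdims=True)  # re-close rows to sum to 1
    logx = np.log(x)
    n_parts = x.shape[1]
    coords = np.empty((x.shape[0], n_parts - 1))
    for i in range(n_parts - 1):
        rest = n_parts - i - 1  # number of parts after part i
        scale = np.sqrt(rest / (rest + 1))
        # Contrast log(part i) with the mean log of the remaining parts.
        coords[:, i] = scale * (logx[:, i] - logx[:, i + 1:].mean(axis=1))
    return coords


# Illustrative (made-up) ancestry proportions for two patients.
example = np.array([
    [0.80, 0.05, 0.00, 0.14, 0.01],
    [0.02, 0.30, 0.03, 0.60, 0.05],
])
print(ilr_pivot(example))
```

The transform maps five proportions that sum to one onto four unconstrained coordinates, which avoids the collinearity that arises when all five proportions are entered into a linear model together.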