Abstract

Cancer can educate platelets by altering their transcriptome profiles. However, the exact education mechanism remains unclear, and variability in the tumor-educated platelet (TEP) transcriptome has not been investigated. In this study, we aimed to build a stratification system for TEPs based on machine learning (ML) data-driven patterns and platelet transcriptome profiles. The study included platelet samples from 1,628 cancer participants from European and United States populations, covering 18 of the most prevalent cancer types. A Gaussian mixture model (GMM) was used to identify robust clusters with similar education patterns, and extreme gradient boosting (XGBoost) was then used to predict cluster membership. Three clusters were eventually identified. The clustering results showed robustness and generality, reflected by comparable patterns of important gene expression, cancer type prevalence, and biological annotation across the derivation, evaluation, and validation cohorts. Cluster 1 (n = 346) was mainly involved in drug metabolism cytochrome P450, metabolism of xenobiotics by cytochrome P450, and glutathione metabolism. Cluster 2 (n = 538) was mainly involved in ribosome, spliceosome, and primary immunodeficiency. Cluster 3 (n = 744) was mainly involved in gap junction and focal adhesion. Based on this novel cluster system, further observational studies can investigate the association between these clusters and cancer progression, prognosis, cancer-associated thrombosis, treatment resistance (to both chemotherapy and immunotherapy), and immune cell infiltration. Overall, we built the first pan-cancer TEP stratification system based on data-driven ML patterns and platelet transcriptional profiles. These clusters could help us better understand the variability of the pan-cancer education mechanism.
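
The two-stage design described above (unsupervised clustering, followed by a supervised model that learns to reproduce the cluster assignments) can be illustrated with a minimal sketch. This is not the study's pipeline: the data are synthetic blobs rather than platelet transcriptomes, the split proportions and hyperparameters are arbitrary assumptions, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example needs only one dependency.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a (samples x genes) expression matrix; NOT real TEP data.
X, _ = make_blobs(n_samples=600, n_features=10, centers=3,
                  cluster_std=1.0, random_state=0)

# Hypothetical derivation / evaluation split (proportions are an assumption).
X_derive, X_eval = train_test_split(X, test_size=0.3, random_state=0)

# Step 1: fit a three-component GMM on the derivation cohort and take the
# most probable component of each sample as its cluster label.
gmm = GaussianMixture(n_components=3, covariance_type="full",
                      n_init=5, random_state=0)
derive_labels = gmm.fit_predict(X_derive)

# Step 2: train a gradient-boosted classifier to predict those labels.
# GradientBoostingClassifier is a stand-in for XGBoost here.
clf = GradientBoostingClassifier(random_state=0)
clf.fit(X_derive, derive_labels)

# Step 3: on held-out samples, compare the classifier's predictions with
# the labels the fitted GMM itself would assign.
eval_labels_gmm = gmm.predict(X_eval)
eval_labels_clf = clf.predict(X_eval)
agreement = float(np.mean(eval_labels_gmm == eval_labels_clf))

cluster_sizes = np.bincount(derive_labels, minlength=3)
print("derivation cluster sizes:", cluster_sizes.tolist())
print("held-out agreement between classifier and GMM:", round(agreement, 3))
```

High agreement on the held-out split indicates that the classifier has captured the cluster boundaries, which is the sense in which a supervised model can "predict the clusters" for new samples without refitting the mixture model.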
