Abstract

Background: Transcriptomic studies have led to the now widely used subtype-based classification of breast cancer, first described by Perou in 2000. Yet biological behavior remains heterogeneous within breast cancer subtypes, underlining the need to refine the taxonomy of breast cancer. Metabolomics is a rapidly expanding field dedicated to the study of metabolism, which integrates the impact of the environment on cell biology. The aim of this study was to identify new biological breast cancer clusters using different unsupervised machine learning (ML) methods based on metabolomic features. These methods, in which no a priori class label information is given to guide the algorithm, seem suitable for this type of problem.

Methods: Fifty-two patients with breast cancer and an indication for adjuvant chemotherapy between 2013 and 2016 were retrospectively included. Tumor resection specimens were analyzed. A total of 1300 metabolomic features were extracted by combined liquid chromatography-mass spectrometry and processed using MZmine software and the “Human Metabolome” database. Five unsupervised ML methods were used: PCA-Kmeans, Sparcl, SIMLR, Spectral clustering and K-sparse. Clinical differences between clusters and variations for every metabolite of interest were analyzed for each clustering method. Cluster separability and homogeneity were evaluated using the silhouette method and t-SNE visual evaluation.

Results: Among the 5 clustering methods, with an optimal partitioning parameter k=3, only K-sparse and SIMLR generated 3 clusters with significant clinical differences; these clusters did not match the traditional subtypes. The differences concerned tumor stage, axillary lymph node invasion, histological grade, Ki-67 proliferation index, and tumor phenotype. With average silhouette values of 0.84 and 0.85, respectively, K-sparse and SIMLR gave the best silhouette scores and showed a better gradient of tumor aggressiveness than the 3 other methods. For the construction of tumor metabolome profiles, 42 and 55 metabolites were selected by K-sparse and SIMLR, respectively. Among the selected metabolites, we found a significant increase in L-methionine, L-phenylalanine, L-isoleucine and L-proline, along with a significant decrease in glutathione (also characteristic of oxidative stress) and glutamate, in the cluster associated with poorer histoprognostic factors. This high concentration of proteinogenic amino acids and low concentration of amino acid precursors could be correlated with poorer prognosis.

Conclusion: Unsupervised ML methods generate heterogeneous results when applied to metabolomics data extracted from breast cancer patients. K-sparse and SIMLR were able to identify three different groups based on the tumor metabolome. Tumors with the worst histoprognostic factors seemed to present higher concentrations of proteinogenic amino acids.

Note: This abstract was not presented at the meeting.

Citation Format: Jocelyn Gal, Caroline Bailleux, David Chardin, Thierry Pourcher, Lun Jing, Jean-Marie Guignonis, Jean-Marc Ferrero, Renaud Schiappa, Emmanuel Chamorey, Olivier Humbert. Unsupervised machine learning methods reveal metabolomic based clusters in breast cancer patients [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 2449.
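
Illustration: The abstract reports average silhouette values as its measure of cluster separability and homogeneity but does not say how they were computed. The sketch below is one minimal way to compute a silhouette average, assuming Euclidean distance and the standard definition s = (b - a) / max(a, b). The function name `silhouette_average` and the toy points are hypothetical and are not taken from the study; in practice a library routine such as scikit-learn's `silhouette_score` would more likely be used.

```python
from math import dist
from statistics import mean


def silhouette_average(points, labels):
    """Mean silhouette coefficient over all samples (Euclidean distance)."""
    # Group sample indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the sample's own cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Toy data (hypothetical, NOT from the study): three well-separated groups.
toy_points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (20.0, 0.0), (20.0, 1.0), (21.0, 0.0),
]
good_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]   # labels follow the true groups
mixed_labels = [0, 1, 2, 0, 1, 2, 0, 1, 2]  # labels cut across the groups

good_score = silhouette_average(toy_points, good_labels)
mixed_score = silhouette_average(toy_points, mixed_labels)
print(f"good partition:  {good_score:.2f}")
print(f"mixed partition: {mixed_score:.2f}")
```

Values close to 1 indicate compact, well-separated clusters, while values near or below 0 indicate overlapping or misassigned samples. This is the sense in which averages of 0.84 and 0.85 are read as good separation for a partition with k=3.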
