Objective: Triple-negative breast cancer (TNBC) poses significant diagnostic challenges due to its aggressive nature. This study develops a deep learning (DL) model based on multi-omics data to improve the accuracy of TNBC subtype and prognosis prediction. It addresses the limitations of prior studies through improvements in data integration, statistical performance, and algorithmic optimization.

Methods: Breast cancer-related molecular data (mRNA, miRNA, gene mutations, and DNA methylation) and magnetic resonance imaging (MRI) data were retrieved from the TCGA and TCIA databases. Single-omics and multi-omics machine learning models were compared, and Bayesian optimization was applied to tune the neural network structure of the multi-omics DL model.

Results: The multi-omics DL model significantly outperformed single-omics models in subtype prediction, achieving an accuracy of 98.0% in cross-validation, 97.0% on the validation set, and 91.0% on an external test set. The MRI radiomics model showed promising performance, especially on the training set; however, its performance decreased in transfer testing, which underscores the advantages of the multi-omics DL model in data consistency and digital processing.

Conclusion: The multi-omics DL model offers notable improvements in statistical performance and transfer learning capability, with significant clinical relevance for TNBC classification and prognosis prediction. The MRI radiomics model proved effective but requires further optimization for cross-dataset application to improve accuracy and consistency. These findings offer new insights into improving TNBC classification and prognosis prediction through multi-omics data and DL algorithms.