Abstract

Exploring the intrinsic differences among breast cancer subtypes is important, because these differences are closely related to clinical diagnosis and the design of treatment plans. With the accumulation of biological and medical datasets, many types of omics data are now available, each describing the disease from a different aspect, and combining them can improve prediction accuracy. Many public databases also make these different types of omics data available for download. In this article, we use estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) status to define breast cancer subtypes, and we classify every pair of subtypes using the SMO-MKL algorithm. We collected mRNA, methylation, and copy number variation (CNV) data from TCGA for this classification. Multiple kernel learning (MKL) is employed to treat each type of omics data distinctly. Using three types of omics data with multiple kernels gives better results than using a single type of omics data with multiple kernels. Furthermore, the significant genes and pathways discovered during feature selection are analyzed. In experiments, the proposed method outperforms other state-of-the-art methods and offers rich biological interpretation.

Highlights

  • Breast cancer is the most commonly diagnosed cancer in women and the second leading cause of cancer death after lung cancer [1]

  • We first showed the accuracy and the area under the curve (AUC) for classification of every pair of breast cancer subtypes

  • The parameters of the polynomial kernel and the Gaussian kernel affect the performance of breast cancer subtype prediction, which means that accuracy can be improved by tuning these parameters sensibly
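As a rough illustration of the last highlight, the sketch below builds one kernel matrix per omics data type and combines them with fixed weights. It is not the SMO-MKL algorithm used in the article, which learns the kernel weights. The function names, the parameter values (`gamma`, `degree`, `coef0`), the equal weights, and the random stand-in data are all assumptions made for the example.

```python
import numpy as np

def polynomial_kernel(X, Y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    return (X @ Y.T + coef0) ** degree

def gaussian_kernel(X, Y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices with fixed (not learned) weights."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

# Random stand-ins for three omics views of the same six samples (hypothetical data).
rng = np.random.default_rng(0)
n_samples = 6
mrna = rng.normal(size=(n_samples, 5))
methylation = rng.normal(size=(n_samples, 4))
cnv = rng.normal(size=(n_samples, 3))

# One kernel per view; the kernel type and parameters are arbitrary choices here.
kernels = [
    gaussian_kernel(mrna, mrna, gamma=0.1),
    gaussian_kernel(methylation, methylation, gamma=1.0),
    polynomial_kernel(cnv, cnv, degree=2),
]
K = combine_kernels(kernels, weights=[1.0, 1.0, 1.0])
print(K.shape)
```

Changing `gamma` or `degree` changes the similarity structure encoded in each matrix, which is why these parameters influence classification performance. The combined matrix `K` could then be passed to any classifier that accepts a precomputed Gram matrix, for example scikit-learn's `SVC(kernel="precomputed")`.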

Introduction

Breast cancer is the most commonly diagnosed cancer in women and the second leading cause of cancer death after lung cancer [1]. The number of people with breast cancer keeps growing, and 6.6% of patients are women diagnosed before the age of 40 [2]. Young women with breast cancer are more likely to have aggressive subtypes, such as triple-negative or HER2-positive breast cancer, and are more likely to be diagnosed at an advanced stage [1]. Breast cancer is a highly heterogeneous disease comprising distinct biological subtypes, which present a varied spectrum of clinical, pathologic, and molecular features with different prognostic and therapeutic implications [3]. Studies on the genotyping of breast cancer are therefore very important for treatment decisions and prognosis prediction [4]. Recent studies have been directed at the molecular classification of breast cancer. With the development of high-throughput research techniques such as microarray, genotyping can
