Abstract

The high dimensionality, high redundancy and class imbalance of cancer multi-omics data are the main challenges for cancer diagnosis. Existing studies have neglected the role of functional proteomics in the occurrence and development of cancer. In this study, a novel hybrid feature selection and ensemble learning framework, referred to as the three-stage feature selection and twice-competitional ensemble learning method (TSFS-TCEM), is proposed for cancer diagnosis. First, we combine transcriptome and functional proteomics data to construct a multi-omics dataset for breast cancer; this is the first time these combined biological data have been applied to breast cancer diagnosis. Second, the proposed method introduces multiple models during both feature selection and diagnostic model construction. The three-stage feature selection integrates features from the different data types, and the twice-competitional ensemble learning framework resolves the class imbalance problem from which a single classifier suffers. TSFS-TCEM achieves a diagnostic accuracy of 99.64%, outperforming all compared methods. In addition, the 5-fold cross-validation sensitivity, specificity and F-measure of the method are all above 99.63%.

Highlights

  • Due to the low early detection rate and high mortality rate, cancer has become the main cause of human death

  • Fold change-false discovery rate (FC-FDR) and information gain can quickly filter the features of high-dimensional transcriptome profiles, whereas the FC-FDR method is unsuitable for functional proteomics data because of the characteristics of those data

  • We propose the TCEM to classify imbalanced multi-omics breast cancer datasets
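
The FC-FDR filter named in the highlights can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Welch t-test, the pseudocount of 1 and the thresholds (|log2 FC| ≥ 1, FDR ≤ 0.05) are all hypothetical choices, and the input is assumed to be non-negative expression values arranged as samples by features.

```python
import numpy as np
from scipy import stats


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    q_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(q_sorted, 0.0, 1.0)
    return q


def fc_fdr_filter(tumor, normal, min_abs_log2fc=1.0, max_fdr=0.05):
    """Return indices of features passing both thresholds.

    tumor, normal: (samples x features) arrays of non-negative expression.
    Thresholds are illustrative defaults, not values from the paper.
    """
    # Log2 fold change of group means, with a pseudocount to avoid log(0).
    log2fc = np.log2(tumor.mean(axis=0) + 1.0) - np.log2(normal.mean(axis=0) + 1.0)
    # Welch t-test per feature (an assumed choice of test).
    _, pvals = stats.ttest_ind(tumor, normal, axis=0, equal_var=False)
    qvals = benjamini_hochberg(pvals)
    keep = (np.abs(log2fc) >= min_abs_log2fc) & (qvals <= max_fdr)
    return np.flatnonzero(keep)
```

Calling `fc_fdr_filter(tumor, normal)` returns the column indices of the retained features, which could then be passed to a second filter such as information gain.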


Summary

Introduction

Due to the low early detection rate and high mortality rate, cancer has become the main cause of human death. With the development of sequencing technology, cancer has been confirmed to be closely associated with genetics [1]. Non-coding RNA (ncRNA), sometimes called "dark matter", plays a vital role in the transcription process. Consequently, abnormal expression of ncRNA may cause disorders in gene expression [2]–[5]. Ref. [2] merged the mRNA and ncRNA transcriptomes of pancreatic cancer to identify altered miRNA regulation of mRNA expression. Moreover, functional proteins have extended the insight on the

