Abstract

Background: Identifying molecular subtypes of ovarian cancer is important. Compared with identifying subtypes from single-omics data, multi-omics data analysis can utilize more information. Autoencoders have been widely used to construct lower-dimensional representations for multi-omics feature integration. However, deep autoencoder architectures are difficult to train to a satisfactory level of generalization performance. To address this problem, we propose a novel deep learning-based framework that robustly identifies ovarian cancer subtypes using a denoising autoencoder.

Results: In the proposed method, composite features of the multi-omics data in The Cancer Genome Atlas were produced by a denoising autoencoder, and the resulting low-dimensional features were then clustered with k-means. Based on the clustering results, we built a lightweight classification model using L1-penalized logistic regression. Furthermore, we applied differential expression analysis and WGCNA to select target genes related to the molecular subtypes. We identified 34 biomarkers and 19 KEGG pathways associated with ovarian cancer.

Conclusions: Independent tests on three GEO datasets demonstrated the robustness of our model. A literature review showed that 19 (56%) of the biomarkers and 8 (42.1%) of the KEGG pathways identified from the classified subtypes have previously been reported to be associated with ovarian cancer. These outcomes indicate that the proposed method is feasible and can provide reliable results.
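The pipeline outlined in the abstract (denoising autoencoder, then k-means, then an L1-penalized logistic regression classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the synthetic data, the single hidden layer, the Gaussian corruption, the proximal-gradient solver, and every hyperparameter and function name (`train_dae`, `kmeans`, `l1_logistic`) are assumptions chosen only to keep the example small. Fitting the classifier on the original features against the cluster labels is likewise an assumption about how the "lightweight" model is used.

```python
import numpy as np

def train_dae(X, n_hidden=2, noise_std=0.3, lr=0.05, epochs=500, seed=0):
    """One-hidden-layer denoising autoencoder, full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        X_noisy = X + rng.normal(0.0, noise_std, X.shape)  # corrupt the input
        H = np.tanh(X_noisy @ W1 + b1)                     # encode the noisy input
        err = (H @ W2 + b2) - X                            # reconstruct the CLEAN input
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size                       # d(MSE)/d(reconstruction)
        g_hid = (g_out @ W2.T) * (1.0 - H ** 2)            # backpropagate through tanh
        W2 -= lr * (H.T @ g_out); b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X_noisy.T @ g_hid); b1 -= lr * g_hid.sum(axis=0)
    return (lambda A: np.tanh(A @ W1 + b1)), losses

def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm on the encoded features."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old centre if a cluster empties
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def l1_logistic(X, y, lam=0.01, lr=0.1, epochs=500):
    """Binary logistic regression with an L1 penalty, fitted by proximal gradient."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft-thresholding
        b -= lr * np.mean(p - y)
    return w, b

# Synthetic stand-in for a samples-by-features omics matrix (two hidden groups).
rng = np.random.default_rng(1)
truth = np.repeat([0, 1], 30)
X = np.where(truth[:, None] == 0, -1.0, 1.0) + rng.normal(0.0, 0.5, (60, 10))

encode, losses = train_dae(X)
Z = encode(X)                                 # low-dimensional composite features
labels = kmeans(Z, k=2)                       # subtype assignment
w, b = l1_logistic(X, labels.astype(float))   # classifier on the original features
pred = (X @ w + b > 0).astype(int)
print("reconstruction loss:", losses[0], "->", losses[-1])
print("classifier agreement with cluster labels:", np.mean(pred == labels))
```

The denoising step is the only difference from a plain autoencoder here: the network sees `X_noisy` but is scored against the clean `X`, which discourages it from simply copying its input. In practice a library implementation (for example a deep learning framework for the autoencoder and an off-the-shelf solver for the penalized regression) would replace these hand-written loops.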

Highlights

  • Identifying molecular subtypes of ovarian cancer is important

  • Because standard autoencoders (AE) lack robustness, it is difficult in practice to extract the most informative features from high-dimensional multi-omics data. To address this problem, we proposed a novel deep learning framework that integrates multi-omics data with a denoising autoencoder (DAE) and feeds the generated features into k-means for clustering (DAE-kmeans)

  • Methods based on traditional dimensionality reduction (PCA, kernel PCA (KPCA)) outperformed only k-means and hierarchical clustering; they performed worse than Sparse K-means (SparseK), iCluster, and the two deep learning-based methods

Introduction

Identifying molecular subtypes of ovarian cancer is important. Compared with identifying subtypes from single-omics data, multi-omics data analysis can utilize more information. Bodelon et al. identified ovarian cancer subtypes from DNA methylation profiles using a nonnegative matrix factorization (NMF) clustering algorithm [5], and Macintyre et al. used NMF mixture modeling to show that copy number is related to ovarian cancer survival and to the probability of platinum-resistant relapse [6]. These data offer different insights into ovarian cancer, but results based on one type of omics data are easily affected by noise and missing values in that data type, and single-omics data can provide only limited information for ovarian cancer research.
