Association Analysis of Deep Genomic Features Extracted by Denoising Autoencoders in Breast Cancer

Qian Liu, Pingzhao Hu

doi:10.3390/cancers11040494

Abstract

Artificial intelligence-based unsupervised deep learning (DL) is widely used to mine multimodal big data. However, there are few applications of this technology to cancer genomics. We aim to develop DL models to extract deep features from breast cancer gene expression data and copy number alteration (CNA) data, separately and jointly. We hypothesize that the deep features are associated with patients’ clinical characteristics and outcomes. Two unsupervised denoising autoencoders (DAs) were developed to extract deep features from TCGA (The Cancer Genome Atlas) breast cancer gene expression and CNA data separately and jointly. A heat map was used to view and cluster patients into subgroups based on these DL features. Fisher’s exact test and Pearson’s chi-square test were applied to test the associations between patient groups and clinical information. Survival differences between the groups were evaluated by Kaplan–Meier (KM) curves. Associations between each feature and patients’ overall survival were assessed using Cox’s proportional hazards (COX-PH) model, and a risk score for each feature set from the different omics data sets was generated from the survival regression coefficients. The risk scores for each feature set were binarized into high- and low-risk patient groups to evaluate survival differences using KM curves. Furthermore, the risk scores were traced back to their gene-level DA weights so that three gene lists, one for each genomic data set, were generated for gene set enrichment analysis. Patients were clustered into two groups based on concatenated features from the gene expression and CNA data, and these two groups showed different overall survival rates (p-value = 0.049) and different ER (estrogen receptor) statuses (p-value = 0.002, OR (odds ratio) = 0.626). All the risk scores from the gene expression data, the CNA data, and their concatenation were significantly associated with breast cancer survival. Patients in the high-risk groups had significantly worse outcomes (p-values ≤ 0.0023). The concatenated risk score was enriched for the AMP-activated protein kinase (AMPK) signaling pathway, the regulation of DNA-templated transcription, the regulation of nucleic acid-templated transcription, the regulation of apoptotic process, the positive regulation of gene expression, the positive regulation of cell proliferation, heart morphogenesis, and the regulation of cellular macromolecule biosynthetic process, with an FDR (false discovery rate) of less than 0.05. We confirmed that DAs can effectively extract meaningful genomic features from genomic data and that concatenating multiple data sources can improve the significance of the features associated with breast cancer patients’ clinical characteristics and outcomes.
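
The risk-score step described in the abstract can be illustrated with a minimal Python sketch. It assumes the risk score is a linear combination of a patient's deep features weighted by Cox regression coefficients, and that patients are split at the median score; the abstract does not state the cutoff, so the median is an assumption. The feature values, coefficients, follow-up times, and event flags below are invented toy numbers, not results from the paper.

```python
from statistics import median

def risk_scores(features, coefs):
    """Linear risk score per patient: sum over features of coef * value."""
    return [sum(b * x for b, x in zip(coefs, row)) for row in features]

def binarize(scores):
    """Split at the median score (assumed cutoff): True means high risk."""
    cut = median(scores)
    return [s > cut for s in scores]

def kaplan_meier(times, events):
    """Kaplan-Meier estimate as (time, survival probability) at each event time."""
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve

def subset(values, mask, keep):
    """Select the values whose mask entry equals `keep`."""
    return [v for v, m in zip(values, mask) if m == keep]

# Toy data (hypothetical): 6 patients, 2 deep features each.
features = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7],
            [0.1, 0.9], [0.7, 0.2], [0.3, 0.8]]
coefs = [1.5, -0.5]              # hypothetical Cox regression coefficients
times = [5, 8, 20, 24, 10, 18]   # follow-up time in months
events = [1, 1, 0, 0, 1, 1]      # 1 = death observed, 0 = censored

scores = risk_scores(features, coefs)
high = binarize(scores)
km_high = kaplan_meier(subset(times, high, True), subset(events, high, True))
km_low = kaplan_meier(subset(times, high, False), subset(events, high, False))
print("high-risk flags:", high)
print("KM high-risk:", km_high)
print("KM low-risk:", km_low)
```

In practice the coefficients would come from a fitted Cox model, and the difference between the two curves would be assessed with a formal test rather than by inspection.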

Highlights

Advanced hardware technologies have greatly increased computational power, making computationally intensive algorithms feasible
Deep features from a single genomic source showed no clear patterns
We showed that unsupervised denoising autoencoders (DAs) are an effective model for extracting meaningful deep genomic features from single- or multi-genomic sources in breast cancer patients
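
To make the idea of a denoising autoencoder concrete, here is a small pure-Python sketch. It is an illustration under stated assumptions, not the authors' model: it assumes masking noise, sigmoid units, tied encoder and decoder weights, a squared-error loss against the clean input, and plain stochastic gradient descent. The layer sizes, noise rate, learning rate, and toy matrices are all hypothetical. Features from two separately trained DAs are concatenated per patient to mimic the multi-source setting.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def corrupt(x, rate, rng):
    """Masking noise: set each input value to 0 with probability `rate`."""
    return [0.0 if rng.random() < rate else v for v in x]

def encode(x, W, b):
    """Hidden (deep) features h = sigmoid(W x + b)."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def decode(h, W, c):
    """Reconstruction with tied weights: x_hat = sigmoid(W^T h + c)."""
    return [sigmoid(sum(W[j][i] * h[j] for j in range(len(h))) + c[i])
            for i in range(len(c))]

def train(data, n_hidden, rate=0.2, lr=0.5, epochs=100, seed=0):
    """Train a one-layer DA with SGD; all hyperparameters are assumed values."""
    rng = random.Random(seed)
    n_in = len(data[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    c = [0.0] * n_in
    for _ in range(epochs):
        for x in data:
            x_noisy = corrupt(x, rate, rng)
            h = encode(x_noisy, W, b)
            x_hat = decode(h, W, c)
            # The error is measured against the CLEAN input, not the noisy one.
            d_out = [(xh - xi) * xh * (1 - xh) for xh, xi in zip(x_hat, x)]
            d_hid = [sum(W[j][i] * d_out[i] for i in range(n_in)) * h[j] * (1 - h[j])
                     for j in range(n_hidden)]
            for j in range(n_hidden):
                for i in range(n_in):
                    # Tied weights receive decoder and encoder gradient terms.
                    W[j][i] -= lr * (d_out[i] * h[j] + d_hid[j] * x_noisy[i])
                b[j] -= lr * d_hid[j]
            for i in range(n_in):
                c[i] -= lr * d_out[i]
    return W, b, c

# Toy matrices (hypothetical): 4 patients, values scaled to [0, 1].
expr = [[0.9, 0.8, 0.1, 0.2], [0.8, 0.9, 0.2, 0.1],
        [0.1, 0.2, 0.9, 0.8], [0.2, 0.1, 0.8, 0.9]]
cna = [[1.0, 0.0, 1.0], [1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0], [0.0, 1.0, 1.0]]

W_e, b_e, _ = train(expr, n_hidden=2)
W_c, b_c, _ = train(cna, n_hidden=2, seed=1)

# Deep features come from encoding the clean data; concatenate per patient.
expr_feats = [encode(x, W_e, b_e) for x in expr]
cna_feats = [encode(x, W_c, b_c) for x in cna]
joint_feats = [fe + fc for fe, fc in zip(expr_feats, cna_feats)]
print(joint_feats)
```

The concatenated rows in `joint_feats` play the role of the multi-source feature set that is then clustered or passed to survival analysis.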

Summary

Introduction

Advanced hardware technologies have greatly increased computational power, making computationally intensive algorithms feasible. The development of biological technologies has greatly reduced the cost of genomic sequencing, which has produced a huge amount of high-dimensional genomic data. Under these circumstances, bioinformatics has become an exciting field in which researchers explore the possibility of interpreting genomic data using advanced computational technologies [1]. Different types of high-dimensional genomic data have been associated with cancer clinical characteristics and outcomes. Gene expression activity in tumor tissues differs markedly from that in normal tissues [3] and has been shown to distinguish the characteristics of cancers [4]. Normal DNA contains some repeated segments, and during cancer development the number of repeats of these segments may change due to abnormal



Journal: Cancers	Publication Date: Apr 7, 2019
Citations: 23	License type: CC BY 4.0


Similar Papers

Ordering copy number alteration data to analyze colorectal cancer progression
Iuliana M Bocicor ... Alex Graudenzi
EMBnet.journal | VOL. 18
09 Nov 2012

Unsupervised feature construction and knowledge extraction from genome-wide assays of breast cancer with denoising autoencoders
Jie Tan ... Casey S Greene
01 Nov 2014

Identification of genetic determinants of breast cancer immune phenotypes by integrative genome-scale analysis
Wouter Hendrickx ... Davide Bedognetti
OncoImmunology | VOL. 6
01 Feb 2017

P2.04-018 Comprehensive Copy Number Alteration and Gene Expression Analysis of Surgically Resected Thymic Carcinoma: Topic: Thymic Malignancies Clinical and Translational
Takao Nakanishi ... Hiroshi Date
Journal of Thoracic Oncology | VOL. 12
31 Dec 2016
