Abstract

Background

The prevalence of liver cancer (hepatocellular carcinoma; HCC) is increasing and, with poor clinical outcomes expected, a greater understanding of HCC aetiology is urgently required. This study explored a deep learning solution to detect biologically important features that distinguish prognostic subgroups. A novel Artificial Neural Network (ANN) architecture trained with a customised objective function (LRSC) was developed. The ANN is intended to discover new data representations that identify patient subgroups that are biologically homogeneous (clustering loss) and similar in survival (survival loss), while removing noise from the data (reconstruction loss). The model was applied to TCGA-HCC multi-omics data and benchmarked against baseline models that use only a reconstruction objective function (BCE, MSE) for learning. With the baseline models, the new features are then filtered based on survival information and used for clustering patients. Variants of the customised objective function that incorporate only reconstruction and clustering losses (LRC), or only reconstruction and survival losses (LRS), were also evaluated. Robust features consistently detected were compared between models and validated in the TCGA and LIRI-JP HCC cohorts.

Results

The combined loss (LRSC) discovered highly significant prognostic subgroups (P-value = 1.55E−77) with more accurate sample assignment (Silhouette scores: 0.59–0.7) compared to baseline models (0.18–0.3). All LRSC bottleneck features (N = 100) were significant for survival, compared to only 11–21 for baseline models. Prognostic subgroups were not explained by disease grade or risk factors. Instead, LRSC identified robust features including 377 mRNAs, many of which were novel (61.27%) compared to those identified by the other losses. Some 75 mRNAs were prognostic in TCGA, while 29 were also prognostic in LIRI-JP. LRSC also identified 15 robust miRNAs, including two novel ones (hsa-let-7g; hsa-mir-550a-1), and 328 methylation features, 71% of which were prognostic. Gene-enrichment and Functional Annotation Analysis identified seven pathways differentiating prognostic clusters.

Conclusions

Combining cluster and survival metrics with the reconstruction objective function facilitated superior prognostic subgroup identification. The hybrid model identified more homogeneous clusters that were consequently more biologically meaningful. The novel and prognostic robust features extracted provide additional information that improves our understanding of a complex disease and may help reveal its aetiology. Moreover, the gene features identified may have clinical applications as therapeutic targets.

Highlights

  • The prevalence of liver cancer (hepatocellular carcinoma; HCC) is increasing and, with poor clinical outcomes expected, a greater understanding of HCC aetiology is urgently required

  • The clusters obtained by the baseline models, binary cross entropy (BCE) and mean squared error (MSE), had lower Silhouette scores (0.18–0.31), indicating poorer cluster quality, i.e. more heterogeneous clusters

  • For BCE and MSE, the number of significant bottleneck features to be used in clustering following filtering, as identified by the univariate Cox models, was quite low and ranged from 11 to 21 out of the bottleneck dimension of 100
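The Silhouette scores quoted in the highlights above measure how well each sample sits inside its assigned cluster: values near 1 indicate compact, well-separated clusters, while values near 0 or below indicate overlapping or misassigned samples. The following is a minimal sketch of the standard Silhouette definition using Euclidean distance. The function name and the toy data are illustrative assumptions and are not taken from the study.

```python
import math


def silhouette_score(points, labels):
    """Mean Silhouette coefficient over all samples (Euclidean distance).

    Illustrative helper, not the study's code. Samples in singleton
    clusters are given a score of 0, following the usual convention.
    """
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    # Group sample indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the sample's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two tight groups far apart.
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = silhouette_score(toy_points, [0, 0, 1, 1])  # sensible assignment
bad = silhouette_score(toy_points, [0, 1, 0, 1])   # clusters mixed up
```

On the toy data, the sensible assignment scores much higher than the mixed-up one, which is the sense in which a higher Silhouette score indicates more homogeneous clusters.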

Summary

Introduction

The prevalence of liver cancer (hepatocellular carcinoma; HCC) is increasing and, with poor clinical outcomes expected, a greater understanding of HCC aetiology is urgently required. The ANN is intended to discover new data representations that identify patient subgroups that are biologically homogeneous (clustering loss) and similar in survival (survival loss), while removing noise from the data (reconstruction loss). Identifying patient subgroups that stratify by survival due to biological differences will be a step towards this goal. In future, this information could enable precision medicine, whereby patients profiled using omics technologies are stratified into subgroups and have their treatments tailored accordingly. This approach to patient management could improve overall survival in HCC. The biological information gained through the identification of prognostic subgroups could also facilitate the discovery of new biomarkers and targets for therapies.
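To make the idea of a combined objective concrete, the sketch below adds a reconstruction term, a survival term, and a clustering term into one scalar loss. The text here does not give the exact form of the three terms, so every choice is an assumption made for illustration: binary cross-entropy for reconstruction, the negative Cox partial log-likelihood for survival, and the mean squared distance to the nearest centroid for clustering. The function names and the weights `w_s` and `w_c` are likewise hypothetical.

```python
import math


def bce_reconstruction(x, x_hat, eps=1e-12):
    """Mean binary cross-entropy between inputs in [0, 1] and reconstructions."""
    total, count = 0.0, 0
    for row, row_hat in zip(x, x_hat):
        for v, p in zip(row, row_hat):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total += -(v * math.log(p) + (1.0 - v) * math.log(1.0 - p))
            count += 1
    return total / count


def cox_neg_partial_log_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk: predicted log-risk per patient; event: 1 = death observed,
    0 = censored. Tied event times get no special handling in this sketch.
    """
    loss, n_events = 0.0, 0
    for i in range(len(risk)):
        if event[i] == 1:
            # Risk set: patients still under observation at time[i].
            log_sum = math.log(
                sum(math.exp(risk[j]) for j in range(len(risk)) if time[j] >= time[i])
            )
            loss -= risk[i] - log_sum
            n_events += 1
    return loss / max(n_events, 1)


def clustering_loss(z, centroids):
    """Mean squared distance from each bottleneck vector to its nearest centroid."""
    total = 0.0
    for row in z:
        total += min(sum((a - b) ** 2 for a, b in zip(row, c)) for c in centroids)
    return total / len(z)


def combined_loss(x, x_hat, z, centroids, risk, time, event, w_s=1.0, w_c=1.0):
    """Reconstruction + weighted survival + weighted clustering terms (assumed form)."""
    return (
        bce_reconstruction(x, x_hat)
        + w_s * cox_neg_partial_log_likelihood(risk, time, event)
        + w_c * clustering_loss(z, centroids)
    )
```

In this sketch, setting `w_c=0` leaves only the reconstruction and survival terms, and setting `w_s=0` leaves only the reconstruction and clustering terms, which mirrors the LRS and LRC variants in spirit. In practice such a loss would be written in an automatic-differentiation framework so that it can be minimised by gradient descent.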
