Deep learning of 2D-Restructured gene expression representations for improved low-sample therapeutic response prediction

Kai Ping Cheng, Wan Xiang Shen, Yu Yang Jiang, Yan Chen, Yu Zong Chen, Ying Tan

doi:10.1016/j.compbiomed.2023.107245

Abstract

Clinical outcome prediction is important for stratified therapeutics. Machine learning (ML) and deep learning (DL) methods facilitate therapeutic response prediction from transcriptomic profiles of cells and clinical samples. Clinical transcriptomic DL is challenged by the small sample sizes (34–286 subjects), high dimensionality (up to 21,653 genes) and unordered nature of clinical transcriptomic data. Established methods rely on ML algorithms that reach AUC/ACC values of 0.6–0.8. Low-sample DL algorithms are needed for enhanced prediction capability. Here, an unsupervised manifold-guided algorithm was employed to restructure transcriptomic data into ordered, image-like 2D representations, which were then learned efficiently with deep ConvNets. Our DL models significantly outperformed the state-of-the-art (SOTA) ML models on 82% of 17 low-sample benchmark datasets (53% with >0.05 AUC/ACC improvement). They were also more robust than the SOTA models in cross-cohort prediction tasks and in identifying robust biomarkers and response-dependent variational patterns consistent with experimental indications.
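
To make the restructuring idea concrete, the following is a minimal sketch of how unordered gene expression vectors might be arranged into image-like 2D arrays without using response labels. It is not the authors' algorithm, which the abstract does not specify: classical multidimensional scaling on gene-gene correlation distance stands in for the manifold step, and a simple sort-based grid assignment stands in for the placement step. The function names `genes_to_grid` and `to_images` and the random toy data are hypothetical.

```python
import numpy as np

def genes_to_grid(X):
    """Assign each gene (column of X) to one cell of a near-square grid.

    Assumption: classical MDS replaces the paper's manifold-guided step.
    """
    n_genes = X.shape[1]
    # Unsupervised gene-gene distance: 1 - Pearson correlation across samples.
    dist = 1.0 - np.corrcoef(X, rowvar=False)
    # Classical MDS: double-center squared distances, keep the top two axes.
    J = np.eye(n_genes) - np.ones((n_genes, n_genes)) / n_genes
    B = -0.5 * J @ (dist ** 2) @ J
    vals, vecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    coords = vecs[:, -2:] * np.sqrt(np.maximum(vals[-2:], 0.0))
    # Grid placement: sort genes along one axis, cut into rows,
    # then sort each row along the other axis.
    side = int(np.ceil(np.sqrt(n_genes)))
    order = np.argsort(coords[:, 0])
    grid = -np.ones((side, side), dtype=int)  # -1 marks an empty cell
    for r in range(side):
        row = order[r * side:(r + 1) * side]
        row = row[np.argsort(coords[row, 1])]
        grid[r, :len(row)] = row
    return grid

def to_images(X, grid):
    """Turn each sample (row of X) into a 2D array laid out by `grid`."""
    images = np.zeros((X.shape[0],) + grid.shape)
    mask = grid >= 0
    images[:, mask] = X[:, grid[mask]]  # empty cells stay zero
    return images

# Toy data: 40 samples, 30 genes (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 30))
grid = genes_to_grid(X)
images = to_images(X, grid)
print(images.shape)
```

Each resulting array places correlated genes near one another, so it could in principle be passed to a convolutional network as a single-channel image. The grid layout is learned once from the training data and reused for every sample.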

Full Text