Deep Learning in Single-cell Analysis

Dylan Molho, Zhaoheng Li, Hongzhi Wen, Yixin Wang, Robert Yang, Yuying Xie, Wenzhuo Tang, Jiayuan Ding, Renming Liu, Wei Jin, Julian Venegas, Yu Leo Lei, Patrick Danaher, Runze Su, Jiliang Tang

doi:10.1145/3641284

Abstract

Single-cell technologies are revolutionizing the entire field of biology. The large volumes of data generated by single-cell technologies are high dimensional, sparse, and heterogeneous and have complicated dependency structures, making analyses using conventional machine learning approaches challenging and impractical. In tackling these challenges, deep learning often demonstrates superior performance compared to traditional machine learning methods. In this work, we give a comprehensive survey on deep learning in single-cell analysis. We first introduce background on single-cell technologies and their development, as well as fundamental concepts of deep learning including the most popular deep architectures. We present an overview of the single-cell analytic pipeline pursued in research applications while noting divergences due to data sources or specific applications. We then review seven popular tasks spanning different stages of the single-cell analysis pipeline, including multimodal integration, imputation, clustering, spatial domain identification, cell-type deconvolution, cell segmentation, and cell-type annotation. Under each task, we describe the most recent developments in classical and deep learning methods and discuss their advantages and disadvantages. Deep learning tools and benchmark datasets are also summarized for each task. Finally, we discuss the future directions and the most recent challenges. This survey will serve as a reference for biologists and computer scientists, encouraging collaborations.

Full Text