Abstract

Knowledge of cell type composition in disease-relevant tissues is an important step towards the identification of cellular targets of disease. We present MuSiC, a method that utilizes cell type-specific gene expression from single-cell RNA sequencing (RNA-seq) data to characterize cell type compositions from bulk RNA-seq data in complex tissues. By appropriate weighting of genes showing cross-subject and cross-cell consistency, MuSiC enables the transfer of cell type-specific gene expression information from one dataset to another. When applied to pancreatic islet and whole kidney expression data in human, mouse, and rat, MuSiC outperformed existing methods, especially for tissues with closely related cell types. MuSiC enables the characterization of cellular heterogeneity of complex tissues for understanding of disease mechanisms. As bulk tissue data are more easily accessible than single-cell RNA-seq data, MuSiC allows the utilization of the vast amounts of disease-relevant bulk tissue RNA-seq data for elucidating cell type contributions in disease.

Highlights

  • Knowledge of cell type composition in disease-relevant tissues is an important step towards the identification of cellular targets of disease

  • MUlti-Subject SIngle Cell deconvolution (MuSiC) starts with multi-subject scRNA-seq data, and assumes that the cells for each subject have been classified into a set of fixed cell types that are shared across subjects

  • MuSiC deconvolves bulk RNA sequencing (RNA-seq) samples to obtain the proportions of these cell types in each sample


Summary

Introduction

Knowledge of cell type composition in disease-relevant tissues is an important step towards the identification of cellular targets of disease. We present MuSiC, a method that utilizes cell type-specific gene expression from single-cell RNA sequencing (RNA-seq) data to characterize cell type compositions from bulk RNA-seq data in complex tissues. Computational methods have been developed to deconvolve cell type proportions using cell type-specific gene expression references[2]. TIMER[5], developed for cancer data, focuses on the quantification of immune cell infiltration. These methods rely on pre-selected cell type-specific marker genes and are sensitive to the choice of significance threshold. They also ignore cross-subject heterogeneity in cell type-specific gene expression as well as within-cell type stochasticity of single-cell gene expression, neither of which can be ignored based on our analysis of multiple scRNA-seq datasets (Supplementary Figure 1a).
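
To make the idea of weighting genes by cross-subject consistency more concrete, the following is a minimal Python sketch of a weighted non-negative least squares deconvolution. It is not the MuSiC implementation: the function names (`reference_and_weights`, `weighted_nnls`), the weighting rule (inverse of a constant plus the cross-subject variance), the multiplicative-update solver, and the toy numbers are all assumptions chosen for illustration.

```python
# Illustrative sketch only; NOT the MuSiC algorithm or its API.
# Function names, the weighting rule, and the toy data are assumptions.


def reference_and_weights(subject_means, eps=1.0):
    """Build a gene-by-cell-type reference and one weight per gene.

    subject_means[s][g][k] is the mean expression of gene g in cell type k
    for subject s. Genes whose cell type-specific expression varies more
    across subjects receive a smaller weight (an assumed, simplified rule).
    """
    n_subjects = len(subject_means)
    n_genes = len(subject_means[0])
    n_types = len(subject_means[0][0])
    reference, weights = [], []
    for g in range(n_genes):
        row, total_var = [], 0.0
        for k in range(n_types):
            values = [subject_means[s][g][k] for s in range(n_subjects)]
            mean = sum(values) / n_subjects
            # Sample variance across subjects (zero when only one subject).
            var = sum((v - mean) ** 2 for v in values) / max(n_subjects - 1, 1)
            row.append(mean)
            total_var += var
        reference.append(row)
        weights.append(1.0 / (eps + total_var))
    return reference, weights


def weighted_nnls(reference, bulk, weights, n_iter=500):
    """Minimise sum_g w_g * (bulk_g - sum_k reference[g][k] * p_k)^2, p >= 0.

    Uses multiplicative updates, which keep p non-negative as long as the
    reference, bulk, and weights are themselves non-negative.
    The result is rescaled to sum to one so it reads as proportions.
    """
    n_genes, n_types = len(reference), len(reference[0])
    p = [1.0 / n_types] * n_types
    # Numerator (A^T W y)_k does not change between iterations.
    num = [
        sum(weights[g] * reference[g][k] * bulk[g] for g in range(n_genes))
        for k in range(n_types)
    ]
    for _ in range(n_iter):
        fitted = [
            sum(reference[g][k] * p[k] for k in range(n_types))
            for g in range(n_genes)
        ]
        # Denominator (A^T W A p)_k for the current estimate.
        den = [
            sum(weights[g] * reference[g][k] * fitted[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        p = [p[k] * num[k] / den[k] if den[k] > 0 else 0.0 for k in range(n_types)]
    total = sum(p)
    return [x / total for x in p] if total > 0 else p


# Toy data (hypothetical): 2 subjects, 4 genes, 2 cell types.
# The third gene is deliberately inconsistent across subjects.
subject_means = [
    [[10, 1], [1, 8], [5, 5], [2, 6]],
    [[12, 1], [1, 10], [1, 9], [2, 6]],
]
reference, weights = reference_and_weights(subject_means)

# Simulated bulk sample: a noise-free 30% / 70% mixture of the two cell types.
bulk = [0.3 * a + 0.7 * b for a, b in reference]
proportions = weighted_nnls(reference, bulk, weights)
print([round(x, 3) for x in proportions])
```

In this toy setting the bulk sample is an exact mixture of the reference profiles, so the estimate should land close to the simulated 30% / 70% split, and the cross-subject-inconsistent gene contributes least to the fit. With real data one would more likely rely on an established solver (for example `scipy.optimize.nnls` applied to rows rescaled by the square root of the weights) rather than this hand-written loop.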

