Abstract

Cell–cell interactions (CCIs) and cell–cell communication (CCC) are critical for maintaining complex biological systems. The availability of single-cell RNA sequencing (scRNA-seq) data opens new avenues for deciphering CCIs and CCCs through identifying ligand-receptor (LR) gene interactions between cells. However, most methods were developed to examine the LR interactions of individual pairs of genes. Here, we propose a novel approach named LR hunting which first uses random forests (RFs)-based data imputation technique to link the data between different cell types. To guarantee the robustness of the data imputation procedure, we repeat the computation procedures multiple times to generate aggregated imputed minimal depth index (IMDI). Next, we identify significant LR interactions among all combinations of LR pairs simultaneously using unsupervised RFs. We demonstrated LR hunting can recover biological meaningful CCIs using a mouse cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) dataset and a triple-negative breast cancer scRNA-seq dataset.

Highlights

  • In recent years, single-cell RNA sequencing has been widely applied to measure gene expression at single-cell resolution, and has become a powerful tool to detect common and rare cell subpopulations, construct cell lineage and pseudotime, and identify spatial gene expression pattern, etc

  • To better capture the complex relationships between LR interactions, here we propose a new multivariate cell communication (CCC) analysis approach based on random forests (RFs), which incorporates the correlations and interactions among intercellular networks to rank and prioritize the LR interactions

  • We analyzed scRNA-seq data in a multivariate framework to identify the complex interactions between genes in different cell types and the gene pairs that are most significantly associated with each other

Read more

Summary

Introduction

Single-cell RNA sequencing (scRNA-seq) has been widely applied to measure gene expression at single-cell resolution, and has become a powerful tool to detect common and rare cell subpopulations, construct cell lineage and pseudotime, and identify spatial gene expression pattern, etc. While there still are many open problems and challenges remaining, scRNA-seq data analysis can be further expanded and developed to fully utilized the data for better understanding the cell heterogeneity and gene expression stochasticity (Lahnemann et al, 2020). The availability of scRNA-seq data provides the great opportunities to decipher the CCIs and CCC through ligand-receptor (LR) gene expressions (Shao et al, 2020; Liu et al, 2021). Several analysis tools have been developed to infer CCC by modeling the LR co-expression data including Spearman correlation between LRs (Zhou et al, 2017; Cohen et al, 2018), product-based score from gene expression of LR pair (Kumar et al, 2018; Cabello-Aguilar et al, 2020; Hu et al, 2021), differential gene combinations (Tyler et al, 2019; Cillo et al, 2020), gene expression permutation test (Efremova et al, 2020; Dries et al, 2021; Noel et al, 2021)

Objectives
Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call