Abstract

High-throughput sequencing technologies have enabled the generation of single-cell RNA-seq (scRNA-seq) data, which make it possible to explore both genetic heterogeneity and phenotypic variation between cells. Several methods have been proposed to detect the genes that drive cell-to-cell variability, with the aim of understanding tumor heterogeneity. However, most existing methods detect these genes one at a time, without considering gene interactions. In this paper, we propose a novel learning framework to detect interactive gene groups in scRNA-seq data based on co-expression network analysis and subgraph learning. We first use spectral clustering to identify subpopulations of cells. For each cell subpopulation, differentially expressed genes are then selected to construct a gene co-expression network. Finally, the interactive gene groups are detected by learning the dense subgraphs embedded in the gene co-expression networks. We applied the proposed framework to a real cancer scRNA-seq dataset to detect interactive gene groups of different cancer subtypes. Systematic gene ontology enrichment analysis was performed to examine the detected gene groups by summarizing the key biological processes and pathways. Our analysis shows that different subtypes exhibit distinct gene co-expression networks and interactive gene groups with different functional enrichment. The interactive genes are expected to provide important references for understanding tumor heterogeneity.
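The last two steps of the framework (building a co-expression network and extracting a dense subgraph) can be illustrated with a minimal sketch. The abstract does not specify how edges are defined or which dense-subgraph algorithm is used, so everything here is an assumption for illustration: edges are drawn where the absolute Pearson correlation exceeds a hypothetical `threshold`, and the dense subgraph is found by standard greedy peeling (repeatedly removing the lowest-degree node). The function names and the toy expression matrix are likewise hypothetical.

```python
import numpy as np

def coexpression_adjacency(expr, threshold=0.8):
    """Boolean gene-gene adjacency from a genes x cells expression matrix.

    Assumption: two genes are linked when the absolute Pearson correlation
    of their expression across cells is at least `threshold`.
    """
    corr = np.corrcoef(expr)              # gene-by-gene correlation matrix
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)          # no self-loops
    return adj

def densest_subgraph(adj):
    """Greedy peeling: drop the minimum-degree node one at a time and keep
    the node set with the highest density (edges / nodes) seen on the way."""
    alive = np.ones(len(adj), dtype=bool)
    best_nodes, best_density = np.flatnonzero(alive), -1.0
    while alive.any():
        nodes = np.flatnonzero(alive)
        degrees = adj[np.ix_(alive, alive)].sum(axis=1)
        density = degrees.sum() / 2 / len(nodes)
        if density > best_density:
            best_density, best_nodes = density, nodes
        alive[nodes[np.argmin(degrees)]] = False   # peel lowest-degree node
    return best_nodes, best_density

# Toy data (hypothetical): genes 0-3 share one expression trend across
# eight cells, while genes 4 and 5 follow unrelated patterns.
base = np.arange(1.0, 9.0)
alt = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
blk = np.array([1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0])
expr = np.vstack([base, 2 * base + 1, base + 0.1 * alt, -base, alt, blk])

adj = coexpression_adjacency(expr, threshold=0.8)
group, density = densest_subgraph(adj)
print(group, density)   # genes 0-3 form the densest group (density 1.5)
```

In this sketch the network would be built once per cell subpopulation, using only the differentially expressed genes of that subpopulation as rows of `expr`. Greedy peeling is a simple approximation; the framework in the paper may rely on a different subgraph-learning procedure.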

Highlights

  • Recent advances in next-generation sequencing (NGS) technologies have enabled the generation of high-throughput single-cell gene expression data, making it possible to explore both genetic heterogeneity and phenotypic variation between cells [1,2]

  • The interactive gene groups were detected by learning the dense subgraphs embedded in the gene co-expression networks

  • We propose a novel learning framework to detect interactive genes in scRNA-seq data based on co-expression network analysis and subgraph learning

Introduction

Recent advances in next-generation sequencing (NGS) technologies have enabled the generation of high-throughput single-cell gene expression data, making it possible to explore both genetic heterogeneity and phenotypic variation between cells [1,2]. Single-cell RNA-seq (scRNA-seq) acquires transcriptomic information from individual cells, providing a higher resolution of cellular differences and a better understanding of cell functions at the genetic and cellular levels [3]. The unprecedented ability to measure gene expression in individual cells holds enormous potential for detecting clinically important tumor subpopulations and understanding tumor heterogeneity [6]. Because a limited number of samples combined with noisy genes may lead to overfitting [8], dimensionality reduction is usually carried out after count normalization to avoid the curse of dimensionality and to provide visual representations of the cellular composition within high-dimensional data.
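As a concrete illustration of count normalization followed by dimensionality reduction, the sketch below applies library-size normalization with a log transform and then projects cells onto their leading principal components. The text does not name a specific normalization or reduction method, so both choices, the scale factor of 10,000, the function names, and the simulated counts are assumptions made for illustration only. The sketch also assumes every cell has at least one count.

```python
import numpy as np

def normalize_counts(counts, scale=1e4):
    """Library-size normalization of a cells x genes count matrix.

    Each cell is rescaled to `scale` total counts and then log1p-transformed.
    Assumes no cell has a total count of zero.
    """
    library_size = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / library_size * scale)

def pca(X, n_components=2):
    """PCA via SVD of the column-centered matrix.

    Returns the cell coordinates on the leading components and the
    fraction of variance each of those components explains.
    """
    centered = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / (S**2).sum()
    return scores, explained[:n_components]

# Simulated counts (hypothetical): 20 cells x 50 genes, where the first
# 10 cells over-express the first 10 genes.
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(20, 50)).astype(float)
counts[:10, :10] += rng.poisson(20.0, size=(10, 10))

norm = normalize_counts(counts)
scores, explained = pca(norm, n_components=2)
print(scores.shape, explained)
```

The two columns of `scores` can be plotted directly to visualize the cellular composition, and the reduced matrix can serve as input to a downstream clustering step.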
