Abstract

Despite its popularity, characterization of subpopulations by transcript abundance is subject to a significant amount of noise. We propose using effective and expressed nucleotide variations (eeSNVs) from scRNA-seq as alternative features for tumor subpopulation identification. We develop a linear modeling framework, SSrGE, to link eeSNVs to the gene expression they are associated with. In all the datasets tested, eeSNVs achieve better accuracy than gene expression for identifying subpopulations. Previously validated cancer-relevant genes are also highly ranked, confirming the significance of the method. Moreover, SSrGE can analyze coupled DNA-seq and RNA-seq data from the same single cells, demonstrating its value in integrating multi-omics single-cell techniques. In summary, SNV features from scRNA-seq data have merits for both subpopulation identification and linking genotype to phenotype.

Highlights

  • Despite its popularity, characterization of subpopulations with transcript abundance is subject to a significant amount of noise

  • The results show that the bipartite graph is a robust and more discriminative alternative (Fig. 3) compared with principal component analysis (PCA) plots as well as SIMLR

  • They provide a means to examine the relationship between expressed nucleotide variations (eeSNVs) and gene expression in the same scRNA-seq sample


Summary

Introduction

Characterization of subpopulations by transcript abundance is subject to a significant amount of noise. We propose using effective and expressed nucleotide variations (eeSNVs) from scRNA-seq as alternative features for tumor subpopulation identification. In scRNA-seq data, patterns of gene expression (GE) are conventionally used as features to explore the heterogeneity among single cells[1,2,3]. The expression of particular genes varies with the cell cycle[6], increasing the heterogeneity observed in single cells[7]. To cope with these sources of variation, normalization of GE is usually a mandatory step before downstream functional analysis[7]. Rather than being considered byproducts of scRNA-seq, SNVs have the potential to improve the accuracy of identifying subpopulations compared to GE, and they offer unique opportunities to study the genetic events (genotype) associated with gene expression (phenotype)[17,18]. We emphasize that extracting SNVs from scRNA-seq data can successfully identify subpopulation complexity and highlight genotype–phenotype relationships.
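
To make the idea of linking SNVs to gene expression with a linear model more concrete, the following is a minimal sketch and not the SSrGE implementation. It assumes a binary cell-by-SNV matrix as predictors, the expression of a single gene as the response, and an L1 penalty so that only a few SNVs keep nonzero coefficients. The penalty choice, the function names (`soft_threshold`, `sparse_linear_fit`), the parameter `alpha`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t (proximal step for an L1 penalty)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def sparse_linear_fit(snv, expr, alpha=0.1, n_iter=500):
    """Fit expr ~ snv @ w + b with an L1 penalty on w (iterative soft-thresholding).

    snv  : (n_cells, n_snvs) array, 1 if the SNV is detected in the cell, else 0
    expr : (n_cells,) array, expression of one gene across cells
    alpha: hypothetical penalty weight; larger values keep fewer SNVs
    """
    n_cells, n_snvs = snv.shape
    x_mean = snv.mean(axis=0)
    y_mean = expr.mean()
    xc = snv - x_mean          # center predictors so the intercept is handled separately
    yc = expr - y_mean
    # Step size from the Lipschitz constant of the squared-error gradient.
    lipschitz = np.linalg.norm(xc, 2) ** 2 / n_cells
    step = 1.0 / lipschitz if lipschitz > 0 else 1.0
    w = np.zeros(n_snvs)
    for _ in range(n_iter):
        grad = xc.T @ (xc @ w - yc) / n_cells
        w = soft_threshold(w - step * grad, step * alpha)
    b = y_mean - x_mean @ w
    return w, b


# Synthetic toy data (assumption): 60 cells, 8 candidate SNVs, only SNV 2 affects expression.
rng = np.random.default_rng(0)
snv = rng.integers(0, 2, size=(60, 8)).astype(float)
true_w = np.zeros(8)
true_w[2] = 3.0
expr = snv @ true_w + 1.0 + 0.1 * rng.normal(size=60)

w, b = sparse_linear_fit(snv, expr, alpha=0.1)
# Rank SNVs by absolute coefficient; the top entries are the candidate links to this gene.
ranking = np.argsort(-np.abs(w))
```

In this toy setting, SNVs with the largest absolute coefficients are the ones the model associates with the gene's expression. Repeating such a fit for each gene would give a per-gene list of associated SNVs, though how the actual framework aggregates or ranks them is not described in this summary.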

