Abstract

The emergence of single-cell RNA-seq (scRNA-seq) technology has made it possible to measure gene expression variations at the cellular level. This breakthrough enables the investigation of a wider range of problems, including the analysis of splicing heterogeneity among individual cells. However, compared to bulk RNA-seq, scRNA-seq data are much noisier due to high technical variability and low sequencing depth. Here we propose SCATS (Single-Cell Analysis of Transcript Splicing) for differential splicing analysis in scRNA-seq, which achieves high sensitivity at low coverage by accounting for technical noise. SCATS models scRNA-seq data either with or without Unique Molecular Identifiers (UMIs). For non-UMI data, SCATS explicitly models technical noise by accounting for capture efficiency and amplification bias through the use of external spike-ins; for UMI data, SCATS models capture efficiency and further accounts for transcriptional burstiness. A key aspect of SCATS lies in its ability to group “exons” that originate from the same isoform(s). Grouping exons is essential in splicing analysis of scRNA-seq data because it naturally aggregates spliced reads across different exons, making it possible to detect splicing events even when sequencing depth is low. To evaluate the performance of SCATS, we analyzed both simulated and real scRNA-seq datasets and compared it with existing methods, including Census and DEXSeq. We show that SCATS has a well-controlled type I error rate and is more powerful than existing methods, especially when the splicing difference is small. In contrast, Census suffers from severe type I error inflation, whereas DEXSeq is more conservative. When applied to mouse brain scRNA-seq datasets, SCATS identified more differential splicing events with subtle differences across cell types than Census and DEXSeq did. With the increasing adoption of scRNA-seq, we believe SCATS will be well suited for various splicing studies.
The implementation of SCATS can be downloaded from https://github.com/huyustats/SCATS.

Highlights

  • The emergence of scRNA-seq technology has made it possible to measure gene expression variations at the cellular level

  • Methods developed for bulk RNA-seq may not be optimal when analyzing data generated from scRNA-seq experiments

  • To fill this gap, we developed SCATS, an open-source software package that allows analysis of scRNA-seq data with or without Unique Molecular Identifiers (UMIs)

Introduction

The emergence of scRNA-seq technology has made it possible to measure gene expression variations at the cellular level. This breakthrough enables the investigation of a wide range of problems, including the analysis of splicing heterogeneity among individual cells. Compared to bulk RNA-seq, scRNA-seq data are much noisier due to high technical variability, low sequencing depth, and, for droplet-based protocols, the lack of full-length transcript sequencing. Despite the growing popularity of scRNA-seq, few published studies have investigated alternative splicing, and those that did relied on methods developed for bulk RNA-seq [1,2,3], which may not be optimal for scRNA-seq data.
