Abstract

An important challenge in pre-processing data from droplet-based single-cell RNA sequencing protocols is distinguishing barcodes associated with real cells from those binding background reads. Existing methods test barcodes individually and consequently do not leverage the strong cell-to-cell correlation present in most datasets. To improve cell detection, we introduce CB2, a cluster-based approach for distinguishing real cells from background barcodes. As demonstrated in simulated and case study datasets, CB2 has increased power for identifying real cells, which allows for the identification of novel subpopulations and improves the precision of downstream analyses.

Introduction

Droplet-based single-cell RNA sequencing (scRNA-seq) [1] is a powerful and widely used approach for profiling genome-wide gene expression in individual cells. Current commercial droplet-based technologies utilize gel beads [2], each containing oligonucleotide indexes made up of bead-specific barcodes combined with unique molecular identifiers (UMIs) [3] and oligo-dT tags to prime polyadenylated RNA. Single cells of interest are combined with reagents in one channel of a microfluidic chip, and gel beads in another, to form gel beads in emulsion, or GEMs. Oligonucleotide indexes bind polyadenylated RNA within each GEM reaction vesicle before the gel beads are dissolved, releasing the bound oligos into solution for reverse transcription. Each resulting cDNA molecule contains a UMI and a GEM-specific barcode. Indexed cDNA is pooled for PCR amplification and sequencing, resulting in a data matrix of UMI counts for each barcode (Additional file 1: Figure S1).
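
To make the final step concrete, the following is a minimal sketch of how sequenced reads might be collapsed into a barcode-by-gene matrix of UMI counts. It is an illustration under stated assumptions, not the pipeline used by any particular vendor or by CB2: the function name `umi_count_matrix`, the `(barcode, umi, gene)` record layout, and the toy reads are all hypothetical, and real pipelines additionally handle alignment and sequencing-error correction in barcodes and UMIs.

```python
from collections import defaultdict


def umi_count_matrix(reads):
    """Collapse (barcode, umi, gene) records into UMI counts.

    Returns a nested dict: {barcode: {gene: number of distinct UMIs}}.
    Reads sharing the same barcode, gene, and UMI are treated as PCR
    duplicates of one original molecule and counted once.
    """
    # (barcode, gene) -> set of distinct UMIs observed for that pair
    seen = defaultdict(set)
    for barcode, umi, gene in reads:
        seen[(barcode, gene)].add(umi)

    counts = defaultdict(dict)
    for (barcode, gene), umis in seen.items():
        counts[barcode][gene] = len(umis)
    return dict(counts)


# Hypothetical toy input: five reads from two barcodes.
reads = [
    ("AAAC", "U1", "GeneA"),
    ("AAAC", "U1", "GeneA"),  # PCR duplicate of the read above
    ("AAAC", "U2", "GeneA"),
    ("AAAC", "U3", "GeneB"),
    ("TTTG", "U1", "GeneA"),
]
matrix = umi_count_matrix(reads)
```

In this sketch, each barcode's row is what a cell-calling method would later examine to decide whether the barcode corresponds to a real cell or to background.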
