Inference after latent variable estimation for single-cell RNA sequencing data.

Anna Neufeld, Daniela Witten, Joshua Popp, Alexis Battle, Lucy L Gao

doi:10.1093/biostatistics/kxac047

Open Access

Abstract

In the analysis of single-cell RNA sequencing data, researchers often characterize the variation between cells by estimating a latent variable, such as cell type or pseudotime, representing some aspect of the cell's state. They then test each gene for association with the estimated latent variable. If the same data are used for both of these steps, then standard methods for computing p-values in the second step will fail to achieve statistical guarantees such as Type 1 error control. Furthermore, approaches such as sample splitting that can be applied to solve similar problems in other settings are not applicable in this context. In this article, we introduce count splitting, a flexible framework that allows us to carry out valid inference in this setting, for virtually any latent variable estimation technique and inference approach, under a Poisson assumption. We demonstrate the Type 1 error control and power of count splitting in a simulation study and apply count splitting to a data set of pluripotent stem cells differentiating to cardiomyocytes.
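
The abstract does not spell out how count splitting works. The sketch below is one plausible reading, assuming the split is done by binomial thinning: each observed count is divided at random between a training part and a test part. The function name `count_split`, the parameter `eps`, and the toy matrix `X` are illustrative assumptions, not names or data from the article.

```python
import random


def count_split(counts, eps=0.5, seed=None):
    """Split a cells-by-genes count matrix into training and test parts.

    Assumption (not stated in the abstract): each count is thinned
    binomially, so train[i][j] ~ Binomial(counts[i][j], eps) and
    test[i][j] = counts[i][j] - train[i][j].
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    rng = random.Random(seed)
    train, test = [], []
    for row in counts:
        # Each of the x observed counts goes to the training part
        # independently with probability eps.
        train_row = [sum(rng.random() < eps for _ in range(x)) for x in row]
        train.append(train_row)
        test.append([x - t for x, t in zip(row, train_row)])
    return train, test


# Toy matrix (3 cells x 4 genes); the values are illustrative only.
X = [[5, 0, 12, 3], [7, 1, 9, 0], [2, 4, 15, 6]]
X_train, X_test = count_split(X, eps=0.5, seed=1)
```

Under this reading, the latent variable (cell type or pseudotime) would be estimated from `X_train` and each gene tested for association using `X_test`. If a count is Poisson with mean λ, binomial thinning yields two independent Poisson counts with means ελ and (1 − ε)λ, which is why the Poisson assumption matters here.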

Journal: Biostatistics
Publication Date: Dec 13, 2022
Citations: 27

Similar Papers

Key Cell Types and Biomarkers in Heart Failure Identified through Analysis of Single-Cell and Bulk RNA Sequencing Data.
Ying Kong ... Ruiting Huang
Mediators of Inflammation | VOL. 2023 | 26 Dec 2023

Integrative analysis of bulk and single-cell RNA sequencing data reveals distinct subtypes of MAFLD based on N1-methyladenosine regulator expression
Jinyong He ... Cong Du
Liver Research | VOL. 7 | 01 Jun 2023

A Regularized Multi-Task Learning Approach for Cell Type Detection in Single-Cell RNA Sequencing Data.
Piu Upadhyay ... Sumanta Ray
Frontiers in Genetics | VOL. 13 | 13 Apr 2022

Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data.
Cheng Jia ... Yu Hu
Nucleic Acids Research | VOL. 45 | 25 Sep 2017