Abstract

High-dimensional flow and mass cytometry allow cell types and states to be characterized in great detail by measuring expression levels of more than 40 targeted protein markers per cell. However, data analysis can be difficult, due to the large size and dimensionality of datasets as well as limitations of existing computational methods. Here, we present diffcyt, a new computational framework for differential discovery analyses in high-dimensional cytometry data, based on a combination of high-resolution clustering and empirical Bayes moderated tests adapted from transcriptomics. Our approach provides improved statistical performance, including for rare cell populations, along with flexible experimental designs and fast runtimes in an open-source framework.

Introduction

High-dimensional flow and mass cytometry allow cell types and states to be characterized in great detail by measuring expression levels of more than 40 targeted protein markers per cell. Several new methods have recently been developed for performing (partially) supervised analyses with the aim of inferring cell populations or states associated with an outcome variable in high-dimensional cytometry data, including Citrus[9], CellCnn[10], cydar[11], and a classic regression-based approach[12]. However, these methods have a number of limitations. In particular: detected features from Citrus cannot be ranked by importance, and the ranking of detected cells from CellCnn cannot be interpreted in terms of statistical significance; rare cell populations are difficult to detect with Citrus and cydar (by contrast, CellCnn is optimized for analysis of rare populations); the response variable in the models for Citrus and CellCnn is the outcome variable, which makes it difficult to account for complex experimental designs; and CellCnn and cydar do not distinguish between “cell type” and “cell state” (e.g. functional) markers, which can make interpretation difficult.
