Abstract

Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. However, the high dimensionality, large size, and non-linear structure of the data pose considerable challenges for data analysis. In particular, dimensionality reduction-based techniques like t-SNE offer single-cell resolution but are limited in the number of cells that can be analyzed. Here we introduce Hierarchical Stochastic Neighbor Embedding (HSNE) for the analysis of mass cytometry data sets. HSNE constructs a hierarchy of non-linear similarities that can be interactively explored with a stepwise increase in detail up to the single-cell level. We apply HSNE to a study on gastrointestinal disorders and three other available mass cytometry data sets. We find that HSNE efficiently replicates previous observations and identifies rare cell populations that were previously missed due to downsampling. Thus, HSNE removes the scalability limit of conventional t-SNE analysis, a feature that makes it highly suitable for the analysis of massive high-dimensional data sets.

Highlights

  • Mass cytometry allows high-resolution dissection of the cellular composition of the immune system

  • For a given high-dimensional data set, such as the three-dimensional illustrative example in Fig. 1a, HSNE [13] builds a hierarchy of local neighborhoods in this high-dimensional space, starting with the raw data, which are subsequently aggregated at more abstract hierarchical levels

  • Mass cytometry data sets generally consist of millions of cells
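The idea of a hierarchy of local neighborhoods, with raw data aggregated into progressively coarser levels, can be illustrated with a small sketch. This is a simplified stand-in and not the published HSNE algorithm: it assumes a brute-force k-nearest-neighbor graph, uniform random walks on that graph, and a fixed fraction of the most-visited points kept as "landmarks" at each level. The function names (`knn_graph`, `select_landmarks`, `build_hierarchy`) and the parameters (`k`, `shrink`, `walk_length`, `walks_per_point`) are hypothetical choices made for this illustration only.

```python
import numpy as np


def knn_graph(points, k):
    """Brute-force k-nearest-neighbor indices for every row of `points`."""
    sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sq_dist, np.inf)  # a point is not its own neighbor
    return np.argsort(sq_dist, axis=1)[:, :k]


def select_landmarks(neighbors, n_landmarks, rng, walk_length=10, walks_per_point=5):
    """Pick the points where short random walks on the kNN graph end most often."""
    n, k = neighbors.shape
    position = np.repeat(np.arange(n), walks_per_point)  # walk start points
    for _ in range(walk_length):
        step = rng.integers(0, k, size=position.size)  # uniform neighbor choice
        position = neighbors[position, step]
    visits = np.zeros(n, dtype=int)
    np.add.at(visits, position, 1)  # count walk endpoints per point
    return np.sort(np.argsort(visits)[-n_landmarks:])


def build_hierarchy(points, n_levels=3, k=5, shrink=0.2, seed=0):
    """Return one index array per level; level 0 holds every raw data point."""
    rng = np.random.default_rng(seed)
    levels = [np.arange(len(points))]
    for _ in range(n_levels - 1):
        current = levels[-1]
        if len(current) <= k + 1:
            break  # too few points left to build another kNN graph
        n_landmarks = min(len(current), max(k + 1, int(len(current) * shrink)))
        neighbors = knn_graph(points[current], k)
        local = select_landmarks(neighbors, n_landmarks, rng)
        levels.append(current[local])  # map back to original indices
    return levels


# Toy three-dimensional data set (synthetic, for illustration only).
toy_points = np.random.default_rng(1).normal(size=(200, 3))
for depth, idx in enumerate(build_hierarchy(toy_points)):
    print(f"level {depth}: {len(idx)} points")
```

Each level is a subset of the one below it, so a selection made at a coarse level can be traced back to the raw data points it summarizes. The brute-force neighbor search used here scales quadratically and is only suitable for toy inputs.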


Summary

Introduction

Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. The linear nature of PCA renders it unsuitable for dissecting the non-linear relationships in mass cytometry data. Non-linear methods (t-SNE [8] and diffusion maps [10]) do retain local data structure, but are limited in the number of cells that can be analyzed. This limit is imposed by the computational burden but, more importantly, by local neighborhoods becoming too crowded in the high-dimensional space, which results in overplotting and presents misleading information in the visualization. We adapted Hierarchical Stochastic Neighbor Embedding (HSNE) [13], which was recently introduced for the analysis of hyperspectral satellite imaging data, to the analysis of mass cytometry data.
