Abstract

Introduction: Mass spectrometry-based proteomics is actively embracing quantitative, single-cell level analyses. Indeed, recent advances in sample preparation and mass spectrometry (MS) have enabled the emergence of quantitative MS-based single-cell proteomics (SCP). While exciting and promising, SCP still has many rough edges. The current analysis workflows are custom and built from scratch. The field therefore needs standardized software that promotes principled and reproducible SCP data analyses.

Areas covered: This special report is the first step toward the formalization and standardization of SCP data analysis. scp, the software that accompanies this work, successfully replicates one of the landmark SCP studies and is applicable to other experiments and designs. We created a repository containing the replicated workflow with comprehensive documentation to encourage further dissemination and improvement of SCP data analyses.

Expert opinion: Replicating SCP data analyses uncovers important challenges in the field. We describe two such challenges in detail: batch correction and data missingness. We review the current state of the art and illustrate its limitations. We also highlight the close dependence between batch effects and data missingness and offer avenues for dealing with these challenges.

Highlights

  • Mass spectrometry-based proteomics is actively embracing quantitative, single-cell level analyses

  • Mass spectrometry (MS)-based approaches to study the proteome of single cells are emerging, using the wide range of possibilities offered by the technology, including miniaturized and automated sample preparation, labeled and label-free quantitation, as well as data-dependent and data-independent approaches [1, 2, 3, 4, 5]

  • We present the replication of the open-source scripts of SCoPE2 published by Specht and colleagues and their implementation as a formal R/Bioconductor package named scp [14, 15]

Summary

Introduction

High-throughput single-cell assays are instrumental in highlighting the biology of heterogeneous cell populations, tissues and cell differentiation processes. Mass spectrometry (MS)-based approaches to study the proteome of single cells are emerging, using the wide range of possibilities offered by the technology, including miniaturized and automated sample preparation, labeled and label-free quantitation, as well as data-dependent and data-independent approaches [1, 2, 3, 4, 5]. We focus on the processing of MS-based single-cell quantitative data, as produced from the raw data by widely used tools such as MaxQuant [6] or Proteome Discoverer (Thermo Fisher Scientific). We have chosen to focus on the replication of the SCoPE2 analysis for several reasons. Among them, it set a milestone in the SCP field by reporting the acquisition of over a thousand single cells, showing that SCP has reached its potential to become a high-throughput technology [16].
