Abstract

Quantitative proteomics by mass spectrometry is widely used in biomarker research and basic biology research for investigation of phenotype level cellular events. Despite this wide application, the methodology for statistical analysis of differentially expressed proteins has not been unified. Various methods such as the t test, linear models and mixed effect models are used to define changes in proteomics experiments. However, none of these methods considers the specific structure of MS data. Choices between methods, often originally developed for other types of data, are based on compromises between features such as statistical power, general applicability and user friendliness. Furthermore, whether proteins identified with a single peptide are included in statistical analysis of differential protein expression varies between studies. Here we present DEqMS, a robust statistical method developed specifically for differential protein expression analysis in mass spectrometry data. In all data sets investigated there is a clear dependence of variance on the number of PSMs or peptides used for protein quantification. DEqMS takes this feature into account when assessing differential protein expression. This allows for a more accurate data-dependent estimation of protein variance and inclusion of single peptide identifications without increasing false discoveries. The method was tested in several data sets, including E. coli proteome spike-in data, using both label-free and TMT-labeled quantification. Compared with previous statistical methods used in quantitative proteomics, DEqMS showed consistently better accuracy in detecting altered protein levels in both label-free and labeled quantitative proteomics data. DEqMS is available as an R package in Bioconductor.

Highlights

  • Quantitative proteomics by mass spectrometry is widely used in biomarker research and basic biology research for investigation of phenotype level cellular events

  • DEqMS is a statistical method for identifying differentially expressed proteins in mass spectrometry (MS) data that takes into account the dependence of variance on the number of peptide spectrum matches (PSMs) or peptides used for protein quantification

  • Dependence of protein variance on the number of quantified PSMs or peptides: to demonstrate the dependence of variance on the number of quantified PSMs, we used an in-depth proteomics data set (TMT 10-plex labeled) of A431 cells sampled at different time points after treatment
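The dependence of variance on PSM count described in the highlights can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the DEqMS implementation: the data are synthetic, the noise model (Gaussian PSM-level noise summarized by the median) is assumed for illustration, and the function names `simulate_protein_variances`, `mean_variance_by_count` and `fit_log_log_trend` are hypothetical. The plain least-squares line is only a stand-in for whatever variance-versus-count fit a real analysis would use.

```python
import math
import random
import statistics


def simulate_protein_variances(psm_counts, n_samples=10, psm_sd=0.5, seed=0):
    """Return (psm_count, variance) pairs for simulated, unchanged proteins.

    Assumed noise model (illustrative only): each per-sample protein log2
    ratio is the median of `count` Gaussian PSM-level log2 ratios.
    """
    rng = random.Random(seed)
    pairs = []
    for count in psm_counts:
        protein_values = [
            statistics.median(rng.gauss(0.0, psm_sd) for _ in range(count))
            for _ in range(n_samples)
        ]
        # Sample variance of the protein-level values across samples.
        pairs.append((count, statistics.variance(protein_values)))
    return pairs


def mean_variance_by_count(pairs):
    """Average the per-protein variances within each PSM-count group."""
    grouped = {}
    for count, var in pairs:
        grouped.setdefault(count, []).append(var)
    return {count: statistics.mean(v) for count, v in sorted(grouped.items())}


def fit_log_log_trend(pairs):
    """Ordinary least-squares fit of log(variance) on log(PSM count)."""
    xs = [math.log(count) for count, _ in pairs]
    ys = [math.log(var) for _, var in pairs]
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# 200 simulated proteins for each PSM count.
psm_counts = [count for count in (1, 2, 4, 8, 16, 32) for _ in range(200)]
pairs = simulate_protein_variances(psm_counts)
trend = mean_variance_by_count(pairs)
slope, intercept = fit_log_log_trend(pairs)

for count, var in trend.items():
    print(f"PSM count {count:>2}: mean variance {var:.4f}")
print(f"log-log slope: {slope:.2f}")
```

In this toy setting the mean variance shrinks as the PSM count grows and the fitted slope is negative. Treating all proteins as if they shared one variance prior would therefore understate the uncertainty of proteins quantified from few PSMs, which is the feature the method is said to account for.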


Summary

Introduction

Quantitative proteomics by mass spectrometry is widely used in biomarker research and basic biology research for investigation of phenotype level cellular events. The methodology for statistical analysis of differentially expressed proteins has not been unified. Various methods such as the t test, linear models and mixed effect models are used to define changes in proteomics experiments. We present DEqMS, a robust statistical method developed for differential protein expression analysis in mass spectrometry data. DEqMS takes into account the dependence of variance on the number of PSMs or peptides used for protein quantification when assessing differential protein expression. This allows for a more accurate data-dependent estimation of protein variance and inclusion of single peptide identifications without increasing false discoveries. In previous quantitative proteomics analyses, Student's t test, ANOVA [2], Limma [3] and linear mixed models [2, 4–6] have been used to detect differentially expressed proteins (DEPs).

From the ‡Department of Oncology-Pathology, Science for Life Laboratory, Karolinska Institutet, Stockholm, Sweden; §Institute for Molecular Medicine, University of Helsinki, Helsinki, Finland; ¶Centre for Molecular Biology of Heidelberg University (ZMBH), Heidelberg, Germany

