Abstract

Tandem mass tag (TMT) is a multiplexing technology widely used in proteomic research. It enables relative quantification of proteins from multiple biological samples in a single MS run with high efficiency and high throughput. However, experiments often require more biological replicates or conditions than a single run can accommodate, and therefore involve multiple TMT mixtures and multiple runs. Such larger-scale experiments combine sources of biological and technical variation in patterns that are complex, unique to TMT-based workflows, and challenging for the downstream statistical analysis. These patterns cannot be adequately characterized by statistical methods designed for other technologies, such as label-free proteomics or transcriptomics. This manuscript proposes a general statistical approach for relative protein quantification in MS-based experiments with TMT labeling. It is applicable to experiments with multiple conditions, multiple biological replicate runs, multiple technical replicate runs, and unbalanced designs. It is based on a flexible family of linear mixed-effects models that handle complex patterns of technical artifacts and missing values. The approach is implemented in MSstatsTMT, a freely available open-source R/Bioconductor package compatible with data processing tools such as Proteome Discoverer, MaxQuant, OpenMS, and SpectroMine. Evaluation on a controlled mixture, simulated datasets, and three biological investigations with diverse designs demonstrated that MSstatsTMT balanced the sensitivity and the specificity of detecting differentially abundant proteins in large-scale experiments with multiple biological mixtures.

Highlights

  • We considered the empirical false discovery rate (eFDR = FP/(TP + FP)), the sensitivity (TP/(TP + FN)), and the specificity (TN/(TN + FP)) of detecting differentially abundant proteins among the testable proteins at the FDR = 0.05 cutoff, where TP, FP, TN, and FN are the numbers of true positives, false positives, true negatives, and false negatives

  • One-way Analysis of Variance (ANOVA) was less sensitive than MSstatsTMT, although the discrepancy became smaller as the sample size increased. edgeR filtered out proteins with missing summaries, and reported the smallest numbers of testable proteins and of true positive and false positive differentially abundant proteins

  • Evaluation on Biological Investigations with Diverse Designs: We evaluated MSstatsTMT in three biological investigations, each illustrating different challenges related to its experimental design
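
The three metrics defined in the first highlight above can be computed directly from the four confusion counts. The following is a minimal Python sketch, not code from MSstatsTMT: the names `ConfusionCounts`, `empirical_fdr`, `sensitivity`, and `specificity` are hypothetical, the example counts are invented for illustration, and returning an eFDR of 0 when no protein is called differentially abundant is an assumed convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts among the testable proteins (hypothetical container)."""
    tp: int  # true positives: truly changing proteins that were detected
    fp: int  # false positives: unchanged proteins that were detected
    tn: int  # true negatives: unchanged proteins that were not detected
    fn: int  # false negatives: truly changing proteins that were missed


def empirical_fdr(c: ConfusionCounts) -> float:
    """eFDR = FP / (TP + FP)."""
    called = c.tp + c.fp
    # Assumed convention: with no detections there are no false discoveries.
    return c.fp / called if called else 0.0


def sensitivity(c: ConfusionCounts) -> float:
    """Sensitivity = TP / (TP + FN)."""
    positives = c.tp + c.fn
    # Undefined when the benchmark contains no truly changing proteins.
    return c.tp / positives if positives else float("nan")


def specificity(c: ConfusionCounts) -> float:
    """Specificity = TN / (TN + FP)."""
    negatives = c.tn + c.fp
    # Undefined when the benchmark contains no unchanged proteins.
    return c.tn / negatives if negatives else float("nan")


# Invented counts for illustration only; they are not results from the paper.
counts = ConfusionCounts(tp=40, fp=2, tn=950, fn=8)
print(f"eFDR        = {empirical_fdr(counts):.3f}")
print(f"sensitivity = {sensitivity(counts):.3f}")
print(f"specificity = {specificity(counts):.3f}")
```

In a controlled mixture or simulation, where the truly changing proteins are known, these counts would be tallied over the testable proteins after applying the FDR = 0.05 cutoff to the adjusted p-values.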


Summary


A statistical approach for differential abundance analysis of proteomic experiments with TMT labeling. Although strategies for addressing these challenges have been proposed, analysis of these experiments remains challenging. This manuscript proposes a general statistical approach for relative protein quantification in MS-based experiments with TMT labeling, designed for experiments with multiple conditions, multiple biological replicate runs, multiple technical replicate runs, and unbalanced designs. It is based on a flexible family of linear mixed-effects models that handle complex patterns of technical artifacts and missing values. We present the details of the approach, as well as its evaluation on a controlled mixture, simulated data sets, and three biological investigations.

