Protein complex-based analysis is resistant to the obfuscating consequences of batch effects --- a case study in clinical proteomics

Wilson Wen Bin Goh, Limsoon Wong

doi:10.1186/s12864-017-3490-3

Abstract

Background: In proteomics, batch effects are technical sources of variation that confound proper analysis, preventing effective deployment in clinical and translational research.

Results: Using simulated and real data, we demonstrate that existing batch effect-correction methods do not always eradicate all batch effects. Worse still, they may alter data integrity and introduce false positives. Moreover, although principal component analysis (PCA) is commonly used for detecting batch effects, the principal components (PCs) themselves may be used as differential features, from which relevant differential proteins may be effectively traced. Batch effects are removable by identifying PCs highly correlated with batch but not class effect.

However, neither PC-based nor existing batch effect-correction methods address subtle batch effects well. Such effects are difficult to eradicate, and these methods involve data transformation and/or projection, which is error-prone. To address this, we introduce the concept of batch-effect resistant methods and demonstrate how such methods incorporating protein complexes are particularly resistant to batch effects without compromising data integrity.

Conclusions: Protein complex-based analyses are powerful, offering unparalleled differential protein-selection reproducibility and high prediction accuracy. We demonstrate for the first time their innate resistance against batch effects, even subtle ones. As complex-based analyses require no prior data transformation (e.g. batch-effect correction), data integrity is protected. Individual checks on top-ranked protein complexes confirm strong association with phenotype classes and not batch. Therefore, the constituent proteins of these complexes are more likely to be clinically relevant.
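The idea of flagging PCs that are highly correlated with batch but not with class can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the correlation thresholds (`hi`, `lo`), and the toy data are all assumptions made for the example, and Pearson correlation against 0/1 labels is just one possible association measure.

```python
import numpy as np

def pc_scores(X, n_components=5):
    """PC scores (samples x components) of X (samples x proteins) via SVD."""
    Xc = X - X.mean(axis=0)                      # centre each protein
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def abs_corr_with_labels(scores, labels):
    """Absolute Pearson correlation of each PC with a 0/1 label vector."""
    labels = np.asarray(labels, dtype=float)
    return np.array([abs(np.corrcoef(scores[:, k], labels)[0, 1])
                     for k in range(scores.shape[1])])

def batch_associated_pcs(X, batch, cls, n_components=5, hi=0.7, lo=0.3):
    """Indices of PCs strongly tied to batch and weakly tied to class.

    The thresholds hi and lo are illustrative assumptions.
    """
    scores = pc_scores(X, n_components)
    r_batch = abs_corr_with_labels(scores, batch)
    r_class = abs_corr_with_labels(scores, cls)
    return [k for k in range(n_components)
            if r_batch[k] >= hi and r_class[k] <= lo]

# Toy data (hypothetical): 20 samples x 50 proteins, balanced design.
rng = np.random.default_rng(0)
batch = np.tile([0, 1], 10)        # alternating batches
cls = np.repeat([0, 1], 10)        # first 10 samples class 0, last 10 class 1
X = rng.normal(0.0, 0.5, size=(20, 50))
X += 3.0 * batch[:, None]          # strong batch shift on every protein
X[:, :10] += 1.0 * cls[:, None]    # weaker class shift on 10 proteins

found = batch_associated_pcs(X, batch, cls)
print("batch-associated PCs (0-based):", found)
```

In a confounded design, where batch and class labels largely coincide, no PC would satisfy both conditions, which is one reason such a screen cannot be relied on alone.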

Highlights

In proteomics, batch effects are technical sources of variation that confound proper analysis, preventing effective deployment in clinical and translational research
Batch effects cannot be completely eradicated via batch effect-correction algorithms
Our method simulates batch effects in the following manner (Fig. 1a): In the first dimension, class-effect sizes are inserted based on the method of Langley and Mayr to distinguish classes D and D* [24]
Using two examples (D2.2. 301H and 302H), we show that removal of the first principal component (PC1) allows samples to cluster based on class rather than batch (Fig. 4a)
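Removing a principal component from a data matrix can be sketched as below: the component's singular value is set to zero and the matrix is reconstructed from the remaining components. This is a generic SVD-based illustration under assumed toy data, not the procedure used for the datasets named above; the function name `remove_pcs` and all parameters are hypothetical.

```python
import numpy as np

def remove_pcs(X, drop=(0,)):
    """Return X (samples x proteins) with the listed PCs (0-based) removed."""
    mu = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
    S_kept = S.copy()
    S_kept[list(drop)] = 0.0                 # zero out the unwanted components
    return (U * S_kept) @ Vt + mu            # reconstruct and restore the means

def mean_abs_gap(X, labels):
    """Mean absolute per-protein difference between the two label groups."""
    labels = np.asarray(labels, dtype=bool)
    return np.abs(X[labels].mean(axis=0) - X[~labels].mean(axis=0)).mean()

# Toy data (hypothetical): batch dominates the variance, so it lands in PC1.
rng = np.random.default_rng(1)
batch = np.tile([0, 1], 10)
cls = np.repeat([0, 1], 10)
X = rng.normal(0.0, 0.5, size=(20, 50))
X += 3.0 * batch[:, None]                    # batch shift on every protein
X[:, :10] += 2.0 * cls[:, None]              # class shift on 10 proteins

X_clean = remove_pcs(X, drop=(0,))
gap_before = mean_abs_gap(X, batch)
gap_after = mean_abs_gap(X_clean, batch)
print("batch gap before:", gap_before, "after:", gap_after)
```

The sketch only works as intended when PC1 is dominated by batch. If class signal also loads on the removed component, part of it is discarded too, which is the kind of data alteration the abstract warns about.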

Summary

Introduction

Batch effects are technical sources of variation that confound proper analysis, preventing effective deployment in clinical and translational research. The emergence of high-performance protein-extraction procedures (e.g., PCT [1]), brute-force spectra-capture methods (e.g., SWATH [2]), and improved multiplexing technologies [3] has transformed proteomics (the high-throughput study of protein expression) from a relatively low-throughput technology to one with critical practical applications in biology. The application of proteomics to clinical samples (i.e., clinical proteomics) is concerned with unraveling proteome changes associated with disease using actual clinical samples. Statistics provides powerful means for differential protein selection based on the hypothesis-testing framework. This process is commonly referred to as “feature selection” (where a feature is a protein in this instance; see Methods for details on feature selection).
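As a rough illustration of hypothesis-test-based feature selection, the sketch below computes a per-protein permutation p-value for the difference in class means and keeps proteins below a significance threshold. The excerpt does not state which test the authors use, so the permutation test, the threshold `alpha`, and the toy data are assumptions; no multiple-testing correction is applied here.

```python
import numpy as np

def permutation_pvalues(X, cls, n_perm=2000, seed=0):
    """Two-sided permutation p-value per protein for the class mean difference.

    X is samples x proteins; cls is a 0/1 class label per sample.
    """
    rng = np.random.default_rng(seed)
    cls = np.asarray(cls, dtype=bool)

    def abs_mean_diff(labels):
        return np.abs(X[labels].mean(axis=0) - X[~labels].mean(axis=0))

    observed = abs_mean_diff(cls)
    exceed = np.zeros(X.shape[1])
    for _ in range(n_perm):
        # Shuffle class labels and count how often the null gap is as large
        exceed += abs_mean_diff(rng.permutation(cls)) >= observed
    return (exceed + 1) / (n_perm + 1)       # add-one rule avoids p = 0

# Toy data (hypothetical): 10 vs 10 samples, 30 proteins, 5 truly differential.
rng = np.random.default_rng(2)
cls = np.repeat([0, 1], 10)
X = rng.normal(0.0, 1.0, size=(20, 30))
X[:, :5] += 3.0 * cls[:, None]

alpha = 0.05                                  # illustrative threshold
pvals = permutation_pvalues(X, cls)
selected = np.flatnonzero(pvals < alpha)
print("selected protein indices:", selected)
```

If batch is correlated with class, a test like this will also select proteins that differ only because of batch, which is how batch effects can introduce false positives into feature selection.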

Methods

Results

Conclusion

Journal: BMC Genomics	Publication Date: Mar 1, 2017
Citations: 20	License type: open-access
