Two-dimensional electrophoresis is a crucial method in proteomics that allows the characterization of proteins' function and expression. This usually implies the identification of proteins that are differentially expressed between two contrasting conditions, for example, healthy versus diseased in human proteomics biomarker discovery and stressful conditions versus control in animal experimentation. The statistical procedures that lead to such identifications are critical steps in the 2-DE analysis workflow. They include a normalization step and a test and probability correction for multiple testing. Statistical issues caused by the high dimensionality of the data and large-scale multiple testing have been a more active topic in transcriptomics than proteomics, especially in microarray analysis. We thus propose to adapt innovative statistical tools developed for microarray analysis and incorporate them in the 2-DE analysis pipeline. In this article, we evaluate the performance of different normalization procedures, different statistical tests and false discovery rate calculation methods with both real and simulated datasets. We demonstrate that the use of statistical procedures adapted from microarrays lead to notable increase in power as well as a minimization of false-positive discovery rate. More specifically, we obtained the best results in terms of reliability and sensibility when using the 'moderate t-test' from Smyth in association with classic false discovery rate from Benjamini and Hochberg. The methods discussed are freely available in the 'prot2D' open source R-package from Bioconductor (http://www.bioconductor.org//) under the terms of the GNU General Public License (version 2 or later). sebastien.artigaud@univ-brest.fr or sebastien.artigaud@gmx.com.
Read full abstract