Application of multiple statistical tests to enhance mass spectrometry-based biomarker discovery.

Niclas C Tan, Harold R Garner, Wayne G Fisher, Kevin P Rosenblatt

doi:10.1186/1471-2105-10-144

Abstract

Background: Mass spectrometry-based biomarker discovery has long been hampered by the difficulty of reconciling lists of discriminatory peaks identified by different laboratories studying the same diseases. We describe a multi-statistical analysis procedure that combines several independent computational methods. This approach capitalizes on the strengths of each method by analyzing the same high-resolution mass spectral data set to discover consensus differential mass peaks that should be robust biomarkers for distinguishing between disease states.

Results: The proposed methodology was applied to a pilot narcolepsy study using logistic regression, hierarchical clustering, t-test, and classification and regression trees (CART). Consensus differential mass peaks with high predictive power were identified across three of the four statistical platforms. Based on the diagnostic accuracy measures investigated, the performance of the consensus-peak model was a compromise between logistic regression and CART, which produced better models than hierarchical clustering and t-test. However, consensus peaks confer a higher level of confidence in their ability to distinguish between disease states because they are not the result of bias toward a particular statistical algorithm. Instead, they were selected as differential across differing data distribution assumptions, demonstrating their true discriminatory potential.

Conclusion: The methodology described here is applicable to any high-resolution MALDI mass spectrometry-derived data set with minimal mass drift, which is essential for peak-to-peak comparison studies. Four statistical approaches with differing data distribution assumptions were applied to the same raw data set to obtain consensus peaks that were found to be statistically differential between the two groups compared. These consensus peaks demonstrated high diagnostic accuracy when used to form a predictive model, as evaluated by receiver operating characteristic (ROC) curve analysis. They should demonstrate a higher discriminatory ability as they are not biased to a particular algorithm. Thus, they are prime candidates for downstream identification and validation efforts.
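
The consensus step described in the abstract can be illustrated with a minimal sketch. It assumes each statistical method has already produced its own list of differential peaks and that peaks are aligned to shared m/z labels (the abstract notes that minimal mass drift is essential for this). The function name, the `min_methods` parameter, and the m/z values are hypothetical and are not taken from the study.

```python
from collections import Counter


def consensus_peaks(selections, min_methods=3):
    """Return the peaks selected by at least `min_methods` methods, sorted by m/z.

    `selections` maps a method name to the peaks that method flagged as
    differential. Peaks are assumed to share aligned m/z labels across methods.
    """
    # set() guards against a method listing the same peak twice.
    counts = Counter(peak for peaks in selections.values() for peak in set(peaks))
    return sorted(peak for peak, n in counts.items() if n >= min_methods)


# Hypothetical m/z labels for illustration only, not values from the study.
selections = {
    "logistic_regression": [1012.5, 1467.8, 2210.1],
    "hierarchical_clustering": [1467.8, 3051.4],
    "t_test": [1012.5, 1467.8, 2210.1, 3051.4],
    "cart": [1012.5, 2210.1],
}

print(consensus_peaks(selections))  # peaks chosen by at least three of four methods
```

The default threshold of three mirrors the "three of the four statistical platforms" reported in the Results; a stricter or looser threshold is a design choice that trades the number of candidate peaks against the level of agreement required.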

Highlights

Mass spectrometry-based biomarker discovery has long been hampered by the difficulty of reconciling lists of discriminatory peaks identified by different laboratories studying the same diseases.
Biomarker selection by logistic regression: we used our modified, Akaike Information Criterion (AIC)-optimal logistic regression protocol to analyze the narcolepsy data set and compared the diagnostic power of its best model with that of the best model obtained using the default single-step call to PROC LOGISTIC in Statistical Analysis Software (SAS).
We have applied four distinct statistical approaches to the same high-resolution mass spectral data set from our narcolepsy study to discover mass peaks that are statistically differential.
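
As a rough illustration of what "AIC-optimal" means, the sketch below ranks candidate logistic regression models by the standard criterion AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L is the maximized likelihood. It does not reproduce the authors' modified protocol, which this page does not describe. The peak names and log-likelihood values are hypothetical, and the models are assumed to have been fitted elsewhere.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(fits):
    """Return the peak subset whose fitted model has the lowest AIC.

    `fits` maps a tuple of peak names to that model's maximized log-likelihood.
    Each model is assumed to have one coefficient per peak plus an intercept.
    """
    return min(fits, key=lambda peaks: aic(fits[peaks], len(peaks) + 1))


# Hypothetical log-likelihoods for illustration only.
fits = {
    ("peak_a",): -18.0,
    ("peak_a", "peak_b"): -12.0,
    ("peak_a", "peak_b", "peak_c"): -11.5,
}

print(best_by_aic(fits))
```

With these numbers, adding a third peak improves the log-likelihood only slightly, so the penalty of 2 per extra parameter outweighs the gain and the two-peak model is preferred.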

Summary

Introduction

Mass spectrometry-based biomarker discovery has long been hampered by the difficulty of reconciling lists of discriminatory peaks identified by different laboratories studying the same diseases. We describe a multi-statistical analysis procedure that combines several independent computational methods. This approach capitalizes on the strengths of each method by analyzing the same high-resolution mass spectral data set to discover consensus differential mass peaks that should be robust biomarkers for distinguishing between disease states. The ultimate goal is reproducibility of the mass spectra across replicates and the alignment of peaks across samples. To this end, next-generation mass spectrometers with high mass accuracy have been employed, along with efforts to standardize sample collection and processing protocols [3,4].

Methods

Results

Conclusion

Journal: BMC Bioinformatics
Publication Date: May 14, 2009
Citations: 32
License type: CC BY 2.0

