MsmsEval: tandem mass spectral quality assignment for high-throughput proteomics.

Jason Wh Wong, Matthew J Sullivan, Hugh M Cartwright, Gerard Cagney

doi:10.1186/1471-2105-8-51

Open Access

https://doi.org/10.1186/1471-2105-8-51

Abstract

Background: In proteomics experiments, database-search programs are the method of choice for protein identification from tandem mass spectra. As amino acid sequence databases grow, however, the computing resources required by these programs have become prohibitive, particularly in searches for modified proteins. Recently, different research groups have proposed methods that limit the number of spectra to be searched based on spectral quality, but rankings of spectral quality have thus far been based on arbitrary cut-off values. In this work, we develop a more readily interpretable spectral quality statistic by providing probability values for the likelihood that spectra will be identifiable.

Results: We describe an application, msmsEval, that builds on previous work by statistically modeling the spectral quality discriminant function using a Gaussian mixture model. This allows a researcher to filter spectra based on the probability that a spectrum will ultimately be identified by database searching. We show that spectra that are predicted by msmsEval to be of high quality, yet remain unidentified in standard database searches, are candidates for more intensive search strategies. Using a well-studied public dataset, we also show that a high proportion (83.9%) of the spectra that msmsEval predicts to be of high quality, but that elude standard search strategies, are in fact interpretable.

Conclusion: msmsEval will be useful for high-throughput proteomics projects and is freely available for download. It supports Windows, Mac OS X and Linux/Unix operating systems.
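To make the mixture-model idea in the abstract concrete, the following is a minimal sketch, not msmsEval's own code. It assumes that each spectrum has already been reduced to a single discriminant score, that the scores come from exactly two Gaussian populations (poor and good spectra), and it uses synthetic scores. The function names are hypothetical. The posterior probability of the higher-scoring component plays the role of the probability that a spectrum is identifiable.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prob_identifiable(x, weights, means, sds):
    """Posterior probability that score x belongs to the high-score component."""
    low = weights[0] * normal_pdf(x, means[0], sds[0])
    high = weights[1] * normal_pdf(x, means[1], sds[1])
    total = low + high
    return high / total if total > 0 else 0.5


def fit_two_gaussians(scores, iterations=100):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximization.

    Returns (weights, means, sds); index 1 is the higher-scoring component.
    """
    ordered = sorted(scores)
    half = len(ordered) // 2
    # Crude initialization: lower half versus upper half of the sorted scores.
    means = [sum(ordered[:half]) / half,
             sum(ordered[half:]) / (len(ordered) - half)]
    spread = max((ordered[-1] - ordered[0]) / 4.0, 1e-3)
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of the high-score component for each score.
        resp_high = [prob_identifiable(x, weights, means, sds) for x in scores]
        # M-step: re-estimate weight, mean and sd of each component.
        for k in (0, 1):
            resp = resp_high if k == 1 else [1.0 - r for r in resp_high]
            total = sum(resp)
            if total < 1e-9:
                continue  # component has collapsed; keep its old parameters
            weights[k] = total / len(scores)
            means[k] = sum(r * x for r, x in zip(resp, scores)) / total
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, scores)) / total
            sds[k] = max(math.sqrt(var), 1e-3)
    return weights, means, sds


# Synthetic discriminant scores (an assumption, not data from the paper).
random.seed(0)
scores = ([random.gauss(-2.0, 1.0) for _ in range(300)]
          + [random.gauss(3.0, 1.0) for _ in range(200)])
weights, means, sds = fit_two_gaussians(scores)
print("P(identifiable | score = 1.0) =",
      round(prob_identifiable(1.0, weights, means, sds), 3))
```

A probability of this kind replaces an arbitrary score cut-off with a value that can be read directly, for example "keep spectra with at least a 10% chance of being identified".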

Highlights

In proteomics experiments, database-search programs are the method of choice for protein identification from tandem mass spectra.
The introduction of orthogonal peptide separation techniques coupled to the mass spectrometer, such as multidimensional protein identification technology (MudPIT) [2] and combined fractional diagonal chromatography (COFRADIC) [3], has significantly increased the potential throughput of tandem mass spectrometry experiments, enabling the identification of hundreds or thousands of proteins from a single sample.
We show that our assigned probability is a good estimate of the observed value and is of practical use in a proteomics lab using different instrument platforms or different types of experimental samples. msmsEval is useful for reducing search processing time and for selecting high-quality unidentified spectra for further assessment.
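The two uses named in the last highlight can be illustrated with a small sketch. It assumes each spectrum already carries a quality probability. The spectrum IDs, the thresholds `min_prob` and `high_prob`, and both function names are hypothetical choices for illustration, not values or interfaces taken from msmsEval.

```python
def select_for_search(quality, min_prob=0.1):
    """Return IDs of spectra worth submitting to a database search."""
    return [sid for sid, p in quality.items() if p >= min_prob]


def follow_up_candidates(quality, identified_ids, high_prob=0.9):
    """Return high-quality spectra that the standard search did not identify."""
    return [sid for sid, p in quality.items()
            if p >= high_prob and sid not in identified_ids]


# Hypothetical quality probabilities for four spectra.
quality = {"scan_001": 0.98, "scan_002": 0.03, "scan_003": 0.95, "scan_004": 0.40}

# Step 1: drop spectra that are unlikely to be identified, to save search time.
to_search = select_for_search(quality)

# Step 2: after a standard search, suppose only scan_001 was identified.
identified = {"scan_001"}
candidates = follow_up_candidates(quality, identified)
print(to_search, candidates)
```

Spectra returned by the second function are the ones that might justify a slower search, for example one that allows more modifications.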

Summary

Introduction

BMC Bioinformatics 2007, 8:51 http://www.biomedcentral.com/1471-2105/8/51

Database-search programs are the method of choice for protein identification from tandem mass spectra. The introduction of orthogonal peptide separation techniques coupled to the mass spectrometer, such as multidimensional protein identification technology (MudPIT) [2] and combined fractional diagonal chromatography (COFRADIC) [3], has significantly increased the potential throughput of tandem mass spectrometry experiments, enabling the identification of hundreds or thousands of proteins from a single sample. This potential has not been fully realized because the vast amount of primary data generates computational burdens, notably the time-consuming and processor-intensive interpretation of tandem mass spectra. The most widely used interpretation programs, such as SEQUEST [4], X!Tandem [5] and Mascot [6], use amino acid sequence databases that are expanding in size daily. Heuristic programs such as X!Tandem [5] and PFSM [7] have been reported to reduce search times by 80–90%. Search time would grow exponentially if the search space were increased to account for all possible modifications.
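The exponential growth can be seen with a simplified count. The sketch below assumes that each modifiable residue is either modified or unmodified, with one modification type per residue. The residue set `"STYM"` and the peptide sequence are hypothetical examples. Real search engines usually cap the number of modifications per peptide, so this is an upper bound under the stated assumption rather than a description of any particular program.

```python
def count_modified_forms(peptide, modifiable_residues="STYM"):
    """Count peptide forms when every modifiable residue is either on or off."""
    sites = sum(1 for residue in peptide if residue in modifiable_residues)
    return 2 ** sites


# Hypothetical peptide with five modifiable residues (S, M, S, T, Y).
peptide = "SAMPLESTYK"
forms = count_modified_forms(peptide)
print(peptide, "has", forms, "candidate forms")

# Each extra modifiable site doubles the number of candidates to score.
for n_sites in (1, 5, 10, 20):
    print(n_sites, "sites ->", 2 ** n_sites, "forms")
```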

Journal: BMC Bioinformatics
Publication Date: Feb 9, 2007
Citations: 69
License type: cc-by
