Abstract

Background: Tandem mass spectrometry followed by database search is currently the predominant technology for peptide sequencing in shotgun proteomics experiments. Most methods compare experimentally observed spectra to the theoretical spectra predicted from the sequences in protein databases. There is a growing interest, however, in comparing unknown experimental spectra to a library of previously identified spectra. This approach has the advantage of taking into account instrument-dependent factors and peptide-specific differences in fragmentation probabilities. It is also computationally more efficient for high-throughput proteomics studies.

Results: This paper investigates computational issues related to this spectral comparison approach. Different methods have been empirically evaluated over several large sets of spectra. First, we illustrate that the peak intensities follow a Poisson distribution. This implies that applying a square root transform will optimally stabilize the peak intensity variance. Our results show that the square root did indeed outperform other transforms, resulting in improved accuracy of spectral matching. Second, different measures of spectral similarity were compared, and the results illustrated that the correlation coefficient was most robust. Finally, we examine how to assemble multiple spectra associated with the same peptide to generate a synthetic reference spectrum. Ensemble averaging is shown to provide the best combination of accuracy and efficiency.

Conclusion: Our results demonstrate that when combined, these methods can boost the sensitivity and specificity of spectral comparison. Therefore they are capable of enhancing and complementing existing tools for consistent and accurate peptide identification.
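The three steps named in the results (square root transform, correlation-based similarity, ensemble averaging) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each spectrum has already been binned onto a common m/z grid, so a spectrum is simply a list of non-negative intensities, and the function names and toy intensities are hypothetical. Averaging the replicates after the transform, rather than before, is likewise an assumption. The rationale for the square root is the standard one for Poisson-like counts: the variance of a Poisson variable equals its mean, and the variance of its square root is approximately constant (about 1/4) when the mean is large.

```python
import math


def sqrt_transform(intensities):
    """Apply the square-root transform to binned peak intensities."""
    # Negative values are clipped to zero; real intensities should not be negative.
    return [math.sqrt(max(x, 0.0)) for x in intensities]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must share the same m/z bins")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat spectrum carries no shape information
    return cov / math.sqrt(var_a * var_b)


def ensemble_average(spectra):
    """Bin-wise mean of several spectra of the same peptide."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


# Hypothetical replicate spectra of one peptide (toy numbers, 5 m/z bins).
replicates = [
    [0, 4, 16, 0, 100],
    [0, 9, 25, 1, 81],
    [1, 1, 9, 0, 121],
]

# Build a synthetic reference spectrum from the transformed replicates.
reference = ensemble_average([sqrt_transform(s) for s in replicates])

# Score an unknown query spectrum against the reference.
query = [0, 4, 25, 0, 90]
score = pearson(sqrt_transform(query), reference)
print(f"similarity = {score:.3f}")
```

In a library search, the query would be scored this way against every candidate reference spectrum within the precursor mass tolerance, and the highest-scoring candidate reported; that search loop is omitted here.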

Highlights

  • Tandem mass spectrometry paired with advanced liquid chromatography has emerged as the standard technique for high-throughput protein identification [1,2].

  • Database search is currently the prevailing approach to sequencing peptides from MS/MS spectra. This approach is often compromised by a large number of unassigned spectra because the fragmentation process is both peptide-specific and instrument-dependent.

Introduction

Tandem mass spectrometry followed by database search is currently the predominant technology for peptide sequencing in shotgun proteomics experiments. There is growing interest in comparing unknown experimental spectra to a library of previously identified spectra. This approach has the advantage of taking into account instrument-dependent factors and peptide-specific differences in fragmentation probabilities. It is also computationally more efficient for high-throughput proteomics studies. Tandem mass spectrometry paired with advanced liquid chromatography has emerged as the standard technique for high-throughput protein identification [1,2]. This shotgun technology does not require the initial separation of individual proteins and can be applied to complex mixtures. The identified peptides are grouped together to determine the underlying proteins.
