Abstract

In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set.

Highlights

  • From the ‡Department of Computer Science and Engineering, University of California, San Diego, 9500, Gilman Drive, San Diego, California 92093; §Algorithmic Biology Laboratory, St

  • In the last two years, because of advances in protein separation and top-down instrumentation, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples containing hundreds and even thousands of proteins [16–21]

  • Data Sets: Two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium were used for benchmarking. S. cerevisiae (SC) Data Set [16]: A lysate was quickly extracted from SC cells using pressure cycling technology in the presence of a protease inhibitor


Summary

Introduction

Most top-down studies were limited to single purified proteins [12,13,14,15]. Top-down studies of protein mixtures were restricted by difficulties in separating and fragmenting intact proteins and by a shortage of robust computational tools. Because algorithms for interpreting top-down spectra are still in their infancy, many recent developments include computational innovations in protein identification. Every protein (possibly with modifications) can be scored against a deconvoluted top-down spectrum, resulting in a Protein-Spectrum-Match (PrSM). The top-down protein identification problem is to find the protein in a database with the highest-scoring PrSM for a top-down spectrum and to output that PrSM if it is statistically significant. ProSightPC is a fast tool that reports the statistical significance of PrSMs.
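The identification problem stated above can be illustrated with a toy sketch. This is not the MS-Align+ spectral alignment algorithm: the scoring function is a simple shared-peak count over prefix residue masses, modifications are ignored, and the `min_score` cutoff is only a placeholder for a real statistical significance test. The function names (`prefix_masses`, `score_prsm`, `best_prsm`), the mass tolerance, and the two-protein database are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Monoisotopic residue masses in Da for a handful of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "P": 97.05276, "V": 99.06841, "T": 101.04768,
    "L": 113.08406, "K": 128.09496, "E": 129.04259,
}


def prefix_masses(protein: str) -> List[float]:
    """Cumulative residue masses of every proper prefix of the sequence."""
    masses, total = [], 0.0
    for residue in protein[:-1]:
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses


def score_prsm(protein: str, spectrum: List[float], tol: float = 0.01) -> int:
    """Toy PrSM score: number of prefix masses explained by a spectrum peak.

    Assumption: the spectrum is a deconvoluted list of neutral prefix
    masses; real scoring also handles suffix ions and mass shifts.
    """
    return sum(
        any(abs(mass - peak) <= tol for peak in spectrum)
        for mass in prefix_masses(protein)
    )


def best_prsm(
    database: Dict[str, str], spectrum: List[float], min_score: int = 1
) -> Optional[Tuple[str, int]]:
    """Return (protein name, score) for the highest-scoring PrSM.

    `min_score` is a hypothetical stand-in for a significance test:
    matches scoring below it are not reported.
    """
    best_name, best_score = None, -1
    for name, sequence in database.items():
        score = score_prsm(sequence, spectrum)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None or best_score < min_score:
        return None
    return best_name, best_score


# Hypothetical two-protein database and a spectrum derived from "GASP".
database = {"P1": "GASP", "P2": "VTLK"}
spectrum = [57.021, 128.059, 215.091]
print(best_prsm(database, spectrum))
```

The sketch separates scoring from reporting so that the fixed cutoff could be replaced by a proper significance estimate without changing the search loop.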
