Abstract

Peptide mass fingerprinting (PMF), although it has become complementary to tandem mass spectrometry for protein identification, is still the subject of in-depth study because of its higher sample throughput, higher specificity for single peptides and lower sensitivity to unexpected post-translational modifications compared with tandem mass spectrometry. In this study, we propose, implement and evaluate a uniform approach that uses support vector machines to incorporate individual concepts and conclusions for accurate PMF. We focus on the inherent attributes and critical issues of the theoretical spectrum (peptides), the experimental spectrum (peaks) and spectrum (mass) alignment. Eighty-one feature-matching patterns, derived from the cleavage type, uniqueness and variable masses of theoretical peptides together with the intensity rank of experimental peaks, are proposed to characterize the matching profile of the PMF procedure. We developed a new strategy that includes matched peak intensity redistribution to handle shared peak intensities, and 440 parameters were generated to digitize each feature-matching pattern. The optimal multi-criteria support vector machine approach, retaining 491 final features out of a feature vector of 35,640 normalized features through cross-training and validation on a publicly available "gold standard" PMF data set of 1733 items, achieved high performance on an evaluation data set of 137 items. Compared with the Mascot, MS-Fit, ProFound and Aldente algorithms commonly used for MS-based protein identification, the feature-matching patterns algorithm separates correct identifications from random matches more clearly and attains the highest values for sensitivity (82%), precision (97%) and F1-measure (89%) of protein identification. Several conclusions reached via this research make general contributions to MS-based protein identification.
First, inherent attributes showed robustness comparable to or even greater than that of other explicit attributes. As an inherent attribute of an experimental spectrum, peak intensity should receive considerable attention during protein identification. Second, alignment between intense experimental peaks and properly digested, unique or non-modified theoretical peptides is very likely to occur in positive peptide mass fingerprinting. Finally, normalization by several types of harmonic factors, including missed cleavages and mass modification, can make important contributions to the performance of the procedure.
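
As a quick consistency check on the figures quoted in the abstract, the sketch below recomputes the F1-measure from the reported sensitivity and precision, and the size of the feature vector from the pattern and parameter counts. It assumes that sensitivity plays the role of recall in the usual harmonic-mean definition of F1, and that the 35,640 features arise as 81 patterns times 440 parameters; the abstract does not state either point explicitly, and the function and variable names are illustrative only.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (recall taken as sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract.
sensitivity = 0.82
precision = 0.97
f1 = f1_measure(precision, sensitivity)

# Assumption: the feature vector is the product of patterns and parameters.
n_patterns = 81
n_params_per_pattern = 440
n_features = n_patterns * n_params_per_pattern

print(f"F1 = {f1:.2f}")             # F1 = 0.89
print(f"features = {n_features}")   # features = 35640
```

Under these assumptions both results agree with the reported 89% F1-measure and the 35,640-feature vector.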

Highlights

  • Several conclusions reached via this research make general contributions to MS-based protein identification

  • In MS-based proteomics, MS1 or MS2, or even MSn, data for peptides produced by proteolysis are obtained and used for peptide mass fingerprinting (PMF), peptide fragment fingerprinting (PFF), and de novo sequencing for qualitative analysis or quantification of proteins

  • The PMF method would become more attractive in proteomics research if we could improve the accuracy of protein identification

Summary

Introduction

Several conclusions reached via this research make general contributions to MS-based protein identification. In MS-based proteomics, MS1 or MS2, or even MSn, data for peptides produced by proteolysis are obtained and used for peptide mass fingerprinting (PMF), peptide fragment fingerprinting (PFF), and de novo sequencing for qualitative analysis or quantification of proteins. [...] specific than PMF for the analysis of a single peptide. The PMF method would become more attractive in proteomics research if the accuracy of protein identification could be improved. With this motivation, several bioinformatics methods and tools have been developed and improved to identify proteins using PMF data. Henzel et al. [4] and Palagi et al. [5] have provided excellent reviews of the evolution of PMF as a method for protein identification.
