Abstract

A major limitation in identifying peptides from complex mixtures by shotgun proteomics is the ability of search programs to accurately assign peptide sequences using mass spectrometric fragmentation spectra (MS/MS spectra). Manual analysis is used to assess borderline identifications; however, it is error-prone and time-consuming, and criteria for acceptance or rejection are not well defined. Here we report a Manual Analysis Emulator (MAE) program that evaluates results from search programs by implementing two commonly used criteria: 1) consistency of fragment ion intensities with predicted gas phase chemistry and 2) whether a high proportion of the ion intensity (proportion of ion current (PIC)) in the MS/MS spectra can be derived from the peptide sequence. To evaluate chemical plausibility, MAE utilizes similarity (Sim) scoring against theoretical spectra simulated by MassAnalyzer software (Zhang, Z. (2004) Prediction of low-energy collision-induced dissociation spectra of peptides. Anal. Chem. 76, 3908-3922) using known gas phase chemical mechanisms. The results show that Sim scores provide significantly greater discrimination between correct and incorrect search results than achieved by Sequest XCorr scoring or Mascot Mowse scoring, allowing reliable automated validation of borderline cases. To evaluate PIC, MAE simplifies the DTA text files summarizing the MS/MS spectra and applies heuristic rules to classify the fragment ions. MAE output also provides data mining functions, which are illustrated by using PIC to identify spectral chimeras, where two or more peptide ions were sequenced together, as well as cases where fragmentation chemistry is not well predicted.
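The two criteria named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give MAE's formulas, so `similarity` uses a normalized dot product of square-root intensities (one common way to compare spectra, not necessarily the Sim score computed by MAE), and `proportion_of_ion_current` simply counts observed intensity that falls within a tolerance of a predicted fragment m/z. The function names, the 1.0 m/z bin width, and the 0.5 m/z tolerance are hypothetical.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(round(mz / bin_width))
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def similarity(observed, theoretical, bin_width=1.0):
    """Illustrative similarity between two spectra, in the range 0..1.

    Assumed form: sum(sqrt(a_k * b_k)) / sqrt(sum(a) * sum(b)) over shared bins.
    Identical spectra score 1.0; spectra with no shared bins score 0.0.
    """
    obs = bin_spectrum(observed, bin_width)
    theo = bin_spectrum(theoretical, bin_width)
    total = sum(obs.values()) * sum(theo.values())
    if total == 0:
        return 0.0
    shared = obs.keys() & theo.keys()
    overlap = sum(math.sqrt(obs[k] * theo[k]) for k in shared)
    return overlap / math.sqrt(total)


def proportion_of_ion_current(observed, predicted_mz, tol=0.5):
    """Fraction of observed intensity lying within tol of any predicted m/z."""
    total = sum(intensity for _, intensity in observed)
    if total == 0:
        return 0.0
    explained = sum(
        intensity
        for mz, intensity in observed
        if any(abs(mz - p) <= tol for p in predicted_mz)
    )
    return explained / total
```

Under these assumptions, a low proportion of ion current combined with an otherwise plausible match would be the kind of case the abstract describes as a possible spectral chimera, since much of the observed intensity is not explained by the assigned sequence.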

Highlights

  • Processing of vendor software-generated DTA files to simplify MS/MS spectra: our first goal was to test whether MassAnalyzer theoretical spectra could be used in place of manual analysis to evaluate chemical plausibility

  • For optimum alignment of the two spectra, we must take into account the processing that occurs when raw data files are created during the typical data collection mode used for high throughput proteomics profiling as well as the way that the information is extracted from those files by the vendor software in creating the text DTA files that summarize the MS/MS spectral information


Summary

Introduction

A major limitation in identifying peptides from complex mixtures by shotgun proteomics is the ability of search programs to accurately assign peptide sequences using mass spectrometric fragmentation spectra (MS/MS spectra). Methods have been developed to specify thresholds for acceptance either by searching datasets against an inverted sequence database of similar size to identify false positive thresholds (4) or by statistical analysis of multiple scores and results from normal searches (4, 5). Using methods such as these, limits on search program scores or combinations of scores can be set to yield a low number of false positives (6), but they will produce large false negative rates (4, 7). Manual analysis lacks uniform criteria, can be error-prone (17), and is impractical with large datasets.
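The inverted-database approach mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the cited studies: the function names, the `REV_` prefix, the representation of hits as `(score, is_decoy)` pairs, and the 1% default rate are all hypothetical.

```python
def reverse_database(proteins):
    """Build a decoy database by reversing each protein sequence.

    proteins: dict mapping protein name to amino acid sequence.
    """
    return {"REV_" + name: seq[::-1] for name, seq in proteins.items()}


def estimate_false_positive_rate(hits, threshold):
    """Ratio of decoy hits to target hits among hits scoring >= threshold.

    hits: list of (score, is_decoy) pairs. Returns 0.0 when no target hit
    passes the threshold.
    """
    targets = sum(1 for score, is_decoy in hits if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold(hits, max_rate=0.01):
    """Lowest observed score whose estimated rate does not exceed max_rate.

    The estimated rate is not guaranteed to decrease monotonically with the
    threshold, so this returns the first passing cutoff in ascending order.
    """
    for threshold in sorted({score for score, _ in hits}):
        if estimate_false_positive_rate(hits, threshold) <= max_rate:
            return threshold
    return None
```

A strict cutoff chosen this way keeps decoy hits rare, but every correct identification scoring below it is discarded, which is the false negative cost the paragraph describes.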
