Abstract

In mass spectrometry-based proteomics, frequently hundreds of thousands of MS/MS spectra are collected in a single experiment. Of these, a relatively small fraction is confidently assigned to peptide sequences, whereas the majority of the spectra are not further analyzed. Spectra are not assigned to peptides for diverse reasons. These include deficiencies of the scoring schemes implemented in the database search tools, sequence variations (e.g. single nucleotide polymorphisms) or omissions in the database searched, post-translational or chemical modifications of the peptide analyzed, or the observation of sequences that are not anticipated from the genomic sequence (e.g. splice forms, somatic rearrangement, and processed proteins). To increase the amount of information that can be extracted from proteomic MS/MS datasets, we developed a robust method that detects high quality spectra within the fraction of spectra unassigned by conventional sequence database searching and computes a quality score for each spectrum. We also demonstrate that iterative search strategies applied to these unassigned high quality spectra significantly increase the number of spectra that can be assigned from datasets and that biologically interesting new insights can be gained from existing data.

Highlights

  • We demonstrate that by interrogating those unassigned high quality spectra more comprehensively using existing protein sequence databases and by searching against genomic databases, one can significantly increase the number of identified peptides, including peptides containing modifications and sequence polymorphisms

  • Acquired MS/MS spectra are searched against a protein sequence database using any of the currently available database search algorithms

Introduction

In mass spectrometry-based proteomics, frequently hundreds of thousands of MS/MS spectra are collected in a single experiment. A number of automated database search tools, including commercial and open source programs, have been developed to assign these spectra to peptide sequences (9–17). These programs correlate the experimental MS/MS spectra with theoretical fragmentation patterns of peptides obtained from a sequence database and use various scoring schemes to find the best matching peptide sequence. This high throughput protein identification process is prone to false positives resulting from incorrect peptide assignments to MS/MS spectra by the database search tools (5, 18–21). Statistical approaches and computational tools have been developed for assigning confidence measures to peptide and protein identifications and for estimating false identification rates. These tools allow faster and more consistent analysis of large scale datasets (5).
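To make the idea of correlating an experimental spectrum with theoretical fragmentation patterns concrete, the following is a minimal sketch of one very simple scoring scheme, a shared peak count over singly charged b- and y-ions. It is an illustration only and is not the scoring function of any of the search tools cited above. The function names (`fragment_mzs`, `shared_peak_count`, `best_match`), the 0.5 m/z tolerance, and the candidate peptides are assumptions chosen for this example; real search tools also consider peak intensities, charge states, and modifications.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def fragment_mzs(peptide):
    """Return singly charged b- and y-ion m/z values of an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    # b-ions: N-terminal fragments; y-ions: C-terminal fragments plus water.
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions + y_ions


def shared_peak_count(observed_mzs, peptide, tol=0.5):
    """Count theoretical fragments that have an observed peak within tol (m/z)."""
    return sum(
        any(abs(obs - theo) <= tol for obs in observed_mzs)
        for theo in fragment_mzs(peptide)
    )


def best_match(observed_mzs, candidates, tol=0.5):
    """Return the candidate peptide with the highest shared peak count."""
    return max(candidates, key=lambda pep: shared_peak_count(observed_mzs, pep, tol))


if __name__ == "__main__":
    # Hypothetical "observed" spectrum: the ideal fragments of PEPTIDE.
    observed = fragment_mzs("PEPTIDE")
    print(best_match(observed, ["GASVLKR", "PEPTIDE"]))
```

A score of this kind always returns a best candidate, even for a spectrum whose true peptide is absent from the database, which is one reason confidence measures and false identification rate estimates are needed on top of the raw search results.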
