Abstract

A crucial component of the analysis of shotgun proteomics datasets is the search engine, an algorithm that attempts to identify the peptide sequence from the parent molecular ion that produced each fragment ion spectrum in the dataset. There are many different search engines, both commercial and open source, each employing a somewhat different technique for spectrum identification. The set of high-scoring peptide-spectrum matches for a defined set of input spectra differs markedly among the various search engine results; individual engines each provide unique correct identifications among a core set of correlative identifications. This has led to the approach of combining the results from multiple search engines to achieve improved analysis of each dataset. Here we review the techniques and available software for combining the results of multiple search engines and briefly compare the relative performance of these techniques.

Highlights

  • The most commonly used proteomics approach, shotgun proteomics, has become an invaluable tool for the highthroughput characterization of proteins in biological samples [1]. This workflow relies on the combination of protein digestion, liquid chromatography (LC)1 separation, tandem mass spectrometry (MS/MS), and sophisticated data analysis in its aim to derive an accurate and complete set of peptides and their inferred proteins that are present in the sample being studied

  • Because the formats generated by MSBlender and PepArML are different than iProphet-generated pepXML output, tools for parsing and processing the MSBlender and PepArML results had to be written; these were based on the Trans-Proteomic Pipeline (TPP) scripts for performing decoybased error rate calculations, reusing as much codebase as possible while adapting them to the unique tables and pepXML flavors generated by the non-TPP tools analyzed

  • We have reviewed the approaches and tools available for improving dataset analysis via combining multiple search engine results, and we compared different combinations of search engines, using iProphet, applied to the same dataset

Read more

Summary

Introduction

The most commonly used proteomics approach, shotgun proteomics, has become an invaluable tool for the highthroughput characterization of proteins in biological samples [1]. This workflow relies on the combination of protein digestion, liquid chromatography (LC) separation, tandem mass spectrometry (MS/MS), and sophisticated data analysis in its aim to derive an accurate and complete set of peptides and their inferred proteins that are present in the sample being studied. The MS instrument acquires fragment ion spectra on a subset of the peptide precursor ions that it measures. From the MS/MS spectra that measure the abundance and mass of the peptide ion fragments, peptides

Objectives
Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call