Abstract
A crucial component of the analysis of shotgun proteomics datasets is the search engine, an algorithm that attempts to identify the peptide sequence from the parent molecular ion that produced each fragment ion spectrum in the dataset. There are many different search engines, both commercial and open source, each employing a somewhat different technique for spectrum identification. The set of high-scoring peptide-spectrum matches for a defined set of input spectra differs markedly among the various search engine results; individual engines each provide unique correct identifications among a core set of correlative identifications. This has led to the approach of combining the results from multiple search engines to achieve improved analysis of each dataset. Here we review the techniques and available software for combining the results of multiple search engines and briefly compare the relative performance of these techniques.
Highlights
The most commonly used proteomics approach, shotgun proteomics, has become an invaluable tool for the high-throughput characterization of proteins in biological samples [1]. This workflow relies on the combination of protein digestion, liquid chromatography (LC) separation, tandem mass spectrometry (MS/MS), and sophisticated data analysis in its aim to derive an accurate and complete set of peptides and their inferred proteins that are present in the sample being studied.
Because the formats generated by MSBlender and PepArML differ from the iProphet-generated pepXML output, tools for parsing and processing the MSBlender and PepArML results had to be written; these were based on the Trans-Proteomic Pipeline (TPP) scripts for performing decoy-based error rate calculations, reusing as much of the codebase as possible while adapting them to the unique tables and pepXML flavors generated by the non-TPP tools analyzed.
We have reviewed the approaches and tools available for improving dataset analysis by combining multiple search engine results, and we compared different combinations of search engines, using iProphet, applied to the same dataset.
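The decoy-based error rate calculation mentioned above can be illustrated with a minimal sketch. It is not the TPP scripts' implementation: it assumes a simple target-decoy estimator (decoys divided by targets at or above a score cutoff), equally sized target and decoy databases, and peptide-spectrum matches supplied as hypothetical `(score, is_decoy)` pairs in which a higher score is better. The function names are illustrative only.

```python
def decoy_fdr_curve(psms):
    """Estimate the FDR at each score cutoff from (score, is_decoy) pairs.

    Assumption: FDR is approximated as decoys / targets among all matches
    scoring at or above the cutoff. Tied scores are not treated specially.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    curve = []
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # With no targets accepted yet, report the worst case.
        fdr = decoys / targets if targets else 1.0
        curve.append((score, min(fdr, 1.0)))
    return curve


def score_threshold(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    Returns None when no cutoff satisfies the limit.
    """
    best = None
    for score, fdr in decoy_fdr_curve(psms):
        if fdr <= max_fdr:
            best = score
    return best
```

Applying the same estimator to each tool's output, once parsed into a common form, is one way to put results from different combiners on a comparable footing; the choice of estimator here is an assumption made for illustration.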
Summary
The most commonly used proteomics approach, shotgun proteomics, has become an invaluable tool for the high-throughput characterization of proteins in biological samples [1]. This workflow relies on the combination of protein digestion, liquid chromatography (LC) separation, tandem mass spectrometry (MS/MS), and sophisticated data analysis in its aim to derive an accurate and complete set of peptides and their inferred proteins that are present in the sample being studied. The MS instrument acquires fragment ion spectra on a subset of the peptide precursor ions that it measures. From the MS/MS spectra that measure the abundance and mass of the peptide ion fragments, peptides