Protein Identification by Tandem Mass Spectrometry and Sequence Database Searching

Alexey I Nesvizhskii

doi:10.1385/1-59745-275-0:87

Abstract

The shotgun proteomics strategy, based on digesting proteins into peptides and sequencing them using tandem mass spectrometry (MS/MS), has become widely adopted. The identification of peptides from acquired MS/MS spectra is most often performed using the database search approach. We provide a detailed description of the peptide identification process and review the most commonly used database search programs. The appropriate choice of search parameters and sequence database is important for successful application of this method, and we provide general guidelines for carrying out efficient analysis of MS/MS data. We also discuss various reasons why database search tools fail to assign the correct sequence to many MS/MS spectra, and draw attention to the problem of false-positive identifications that can significantly diminish the value of published data. To assist in the evaluation of peptide assignments to MS/MS spectra, we review the scoring schemes implemented in the most frequently used database search tools. We also describe statistical approaches and computational tools for validating peptide assignments to MS/MS spectra, including the concept of expectation values, reversed database searching, and the empirical Bayesian analysis of PeptideProphet. Finally, the process of inferring the identities of the sample proteins given the list of peptide identifications is outlined, and the limitations of shotgun proteomics with regard to discrimination between protein isoforms are discussed.
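
As an illustration of the reversed database searching mentioned above, the following is a minimal sketch and not the chapter's own implementation. It assumes that decoy entries are made by reversing each protein sequence, that every peptide-spectrum match carries a score where higher is better, and that the false-positive rate above a score threshold can be approximated by the ratio of decoy to target matches. The function names and the data layout are hypothetical.

```python
from typing import Iterable, Tuple


def reverse_sequence(seq: str) -> str:
    """Build a decoy entry by reversing a protein sequence (assumed decoy scheme)."""
    return seq[::-1]


def estimate_fdr(psms: Iterable[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the false discovery rate among matches scoring >= threshold.

    psms: (score, is_decoy) pairs from a search against target plus
          reversed sequences; higher scores are assumed to be better.
    Returns decoys / targets, capped at 1.0, or 0.0 if no target passes.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score >= threshold:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
    if targets == 0:
        return 0.0
    return min(1.0, decoys / targets)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = [(5.0, False), (4.0, False), (3.5, True),
               (3.0, False), (2.0, True), (1.0, False)]
    print(reverse_sequence("PEPTIDE"))
    print(estimate_fdr(example, threshold=3.0))
```

Raising the threshold in such a scheme trades sensitivity for a lower estimated error rate. Other estimators exist for concatenated target-decoy searches, so the ratio used here should be read as one simple choice.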

Full Text