Reliable automatic protein identification from matrix-assisted laser desorption/ionization mass spectrometric peptide fingerprints.

Peter Berndt, Uwe Hobohm, Hanno Langen

doi:10.1002/(sici)1522-2683(19991201)20:18<3521::aid-elps3521>3.0.co;2-8

Abstract

Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry of protein samples from two-dimensional (2-D) gels, in conjunction with protein sequence database searches, is frequently used to identify proteins. Moreover, with highly accurate mass spectrometry instruments and high-throughput facilities for preparing and handling protein samples from 2-D gels, complete 2-D gels with hundreds or even thousands of protein spots can be analyzed automatically, without human intervention ("proteome analysis"). However, the lack of software for precise automatic analysis and annotation of mass spectra, and of software for batch sequence database queries, is increasingly becoming a significant bottleneck in the proteomics workflow. In the present paper we outline an algorithm for reliable, accurate, and automatic evaluation of mass spectrometric data and database searches. We show that simply selecting from the sequence database the protein with the most matching fragment masses often leads to false-positive results. Reliable protein identification depends on several parameters: the accuracy of fragment mass determination, the number of masses submitted for query, the mass distribution of the query masses, the number of masses matching between sample and database protein, the size of the sequence database, and the kind and number of modifications considered. Using these parameters, we derive a simple statistical estimation that can be used to calculate the probability of true-positive protein identification.
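To illustrate why the raw match count alone can mislead, the following minimal Python sketch relates several of the parameters listed above (mass accuracy, number of query masses, number of matches, database size) to the chance of a random hit. It is not the statistical estimation derived in the paper. It assumes a plain binomial model with theoretical peptide masses spread uniformly over a mass window, and all function names and parameters are hypothetical.

```python
from math import comb


def count_matches(query_masses, protein_masses, tol_da=0.1):
    """Count query masses lying within tol_da of any theoretical peptide mass."""
    return sum(
        any(abs(q - p) <= tol_da for p in protein_masses)
        for q in query_masses
    )


def single_mass_hit_probability(n_peptides, tol_da, mass_range_da):
    """Assumed chance that one random query mass hits a protein.

    Illustrative assumption: the protein's n_peptides theoretical masses are
    spread uniformly over mass_range_da, each covering a window of 2 * tol_da.
    """
    return min(1.0, n_peptides * 2 * tol_da / mass_range_da)


def chance_match_probability(n_query, n_matched, p_single):
    """Binomial tail: P(at least n_matched of n_query masses match by chance)."""
    return sum(
        comb(n_query, k) * p_single**k * (1 - p_single) ** (n_query - k)
        for k in range(n_matched, n_query + 1)
    )


def expected_random_hits(db_size, n_query, n_matched, p_single):
    """Expected number of database proteins reaching n_matched by chance alone."""
    return db_size * chance_match_probability(n_query, n_matched, p_single)
```

Under these assumptions, loosening the mass tolerance raises `p_single`, and a larger database raises the expected number of random hits for the same match count, so a protein with many matching masses is not necessarily a true positive.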
