In shotgun proteomics, a search engine analyzes experimentally acquired mass spectra and reports a peptide-spectrum match (PSM) for each spectrum. However, most of the identified PSMs are incorrect, and various postprocessing tools have therefore been developed to rerank the peptide identifications. Yet these methods suffer from issues such as dependence on distributional assumptions, reliance on shallow models, and limited effectiveness. In this work, we propose AttnPep, a deep learning model for rescoring PSMs that uses a self-attention module. This module helps the neural network focus on features relevant to PSM classification and ignore irrelevant ones, which allows AttnPep to analyze the output of different search engines and improve PSM discrimination accuracy. We considered a PSM correct if it achieved a q-value <0.01 and compared AttnPep with the existing mainstream tools PeptideProphet, Percolator, and proteoTorch. On average, AttnPep identified 9.29% more correct PSMs than the other methods. Additionally, AttnPep better distinguished correct from incorrect PSMs and found more synthetic peptides in the complex SWATH data set.
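The acceptance criterion above (q-value <0.01) can be illustrated with a minimal sketch. The abstract does not say how q-values are estimated, so the code below assumes the common target-decoy approach, in which the false discovery rate at a score threshold is approximated by the ratio of decoy to target PSMs above it. The function names `q_values` and `count_accepted` are hypothetical and are not part of AttnPep or of any of the tools named above.

```python
def q_values(scores, is_decoy):
    """Estimate a q-value per PSM, assuming a target-decoy search.

    scores   -- rescored PSM scores, higher meaning more confident
    is_decoy -- True where the PSM matched a decoy peptide
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Estimated FDR at each rank: decoys seen so far / targets seen so far.
    fdr = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdr.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this rank or at any lower-ranked threshold.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdr[rank])
        q[order[rank]] = running_min
    return q


def count_accepted(scores, is_decoy, threshold=0.01):
    """Count target PSMs whose q-value is strictly below the threshold."""
    q = q_values(scores, is_decoy)
    return sum(1 for qi, d in zip(q, is_decoy) if qi < threshold and not d)
```

Under this assumption, comparing rescoring methods amounts to calling `count_accepted` on each method's scores for the same set of PSMs and comparing the counts.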