Abstract

Background: A real-time peptide-spectrum matching (RT-PSM) algorithm is a database search method to interpret tandem mass spectra (MS/MS) under strict time constraints. Restricted by the hardware and architecture of individual workstations, previous RT-PSM algorithms are either not fast enough to satisfy all real-time system requirements or must sacrifice inference accuracy to provide the required processing speed.

Results: We develop two parallelized algorithms for MS/MS data analysis: a multi-core RT-PSM (MC RT-PSM) algorithm, which works on individual workstations, and a distributed computing RT-PSM (DC RT-PSM) algorithm, which works on a computer cluster. Two data sets are employed to evaluate the performance of the proposed algorithms. The simulation results show that, when 240 logical cores are employed, the proposed algorithms reach approximately 216.9-fold speedup on a sub-task (the similarity scoring module) and 84.78-fold speedup on the overall process, compared with a single-thread run of the RT-PSM algorithm.

Conclusions: The improved RT-PSM algorithms can meet the processing speed requirement without sacrificing inference accuracy. With some configuration adjustments, the proposed algorithm can support many peptide identification programs, such as X!Tandem and the CUDA version of RT-PSM.
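To put the reported speedups in perspective, the short Python sketch below divides each speedup by the 240 logical cores to obtain a parallel efficiency. The function and variable names are illustrative and do not come from the paper; only the three numbers are taken from the Results.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup that was achieved."""
    return speedup / cores


# Figures quoted in the abstract.
CORES = 240
SCORING_SPEEDUP = 216.9   # similarity scoring module
OVERALL_SPEEDUP = 84.78   # overall process

scoring_eff = parallel_efficiency(SCORING_SPEEDUP, CORES)
overall_eff = parallel_efficiency(OVERALL_SPEEDUP, CORES)

print(f"scoring module efficiency: {scoring_eff:.1%}")
print(f"overall efficiency:        {overall_eff:.1%}")
```

By this measure the scoring module runs at roughly 90% of ideal linear scaling, while the overall process reaches roughly 35%, which suggests that stages outside the scoring module account for most of the remaining cost.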

Highlights

  • A real-time peptide-spectrum matching (RT-PSM) algorithm is a database search method to interpret tandem mass spectra (MS/MS) with strict time constraints

  • We develop an improved peptide identification procedure on a computer cluster, based on the real-time peptide-spectrum matching (RT-PSM) algorithm proposed by Wu et al. [1]

  • Dataset A is the one used in the RT-PSM package [1]: the experimental MS/MS data source includes 2058 groups of spectrum data, and the protein database is a subset of the UniRef100 human protein database


Summary

Introduction

A real-time peptide-spectrum matching (RT-PSM) algorithm is a database search method to interpret tandem mass spectra (MS/MS) with strict time constraints. Tandem mass spectrometry (MS/MS) has been widely used in the early detection of diseases, in chemical analysis, and in the pharmaceutical industry. It can efficiently identify and characterize the protein components of complex biological mixtures. Interpreting MS/MS spectra requires performing peptide-spectrum matches (PSMs) by searching experimental MS/MS spectra against a protein sequence database. To improve the efficiency and accuracy of MS/MS experiments, a mass spectrometry system needs to include a real-time peptide identification procedure that analyzes peptides and performs the PSMs within the peptide identification life cycle.
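As a rough illustration of what searching a spectrum against candidate peptides involves, the sketch below scores each candidate by counting theoretical b- and y-ion fragments that fall within a mass tolerance of an observed peak. This shared-peak count is a deliberately simple stand-in and is not the RT-PSM scoring function; the function names, the 0.5 Da tolerance, and the `workers` parameter are assumptions made for the example. Scoring of independent candidates is the part that can be spread over several cores.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_mz(peptide):
    """Singly charged b- and y-ion m/z values for a peptide string."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    prefix = 0.0
    for m in masses[:-1]:            # b-ions: N-terminal prefixes
        prefix += m
        ions.append(prefix + PROTON)
    suffix = 0.0
    for m in reversed(masses[1:]):   # y-ions: C-terminal suffixes
        suffix += m
        ions.append(suffix + WATER + PROTON)
    return ions


def shared_peak_count(peaks, peptide, tol=0.5):
    """Count theoretical fragments with an observed peak within tol."""
    return sum(
        any(abs(p - f) <= tol for p in peaks) for f in fragment_mz(peptide)
    )


def rank_candidates(peaks, peptides, tol=0.5, workers=1):
    """Score every candidate peptide and return them best first.

    workers > 1 distributes scoring over processes (hypothetical knob).
    """
    score = partial(shared_peak_count, peaks, tol=tol)
    if workers > 1:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            scores = list(pool.map(score, peptides, chunksize=64))
    else:
        scores = [score(p) for p in peptides]
    return sorted(zip(peptides, scores), key=lambda ps: ps[1], reverse=True)
```

Calling `rank_candidates(peaks, candidates)` returns `(peptide, score)` pairs sorted by score. When using `workers` greater than 1, the call should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.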
