The Paragon Algorithm, a Next Generation Search Engine That Uses Sequence Temperature Values and Feature Probabilities to Identify Peptides from Tandem Mass Spectra

Ignat V Shilov, Sean L Seymour, Alpesh A Patel, Alex Loboda, Wilfred H Tang, Sean P Keating, Christie L Hunter, Lydia M Nuwaysir, Daniel A Schaeffer

doi:10.1074/mcp.t600050-mcp200

Abstract

The Paragon Algorithm, a novel database search engine for the identification of peptides from tandem mass spectrometry data, is presented. Sequence Temperature Values are computed using a sequence tag algorithm, allowing the degree of implication by an MS/MS spectrum of each region of a database to be determined on a continuum. Counter to conventional approaches, features such as modifications, substitutions, and cleavage events are modeled with probabilities rather than by discrete user-controlled settings to consider or not consider a feature. The use of feature probabilities in conjunction with Sequence Temperature Values allows for a very large increase in the effective search space with only a very small increase in the actual number of hypotheses that must be scored. The algorithm has a new kind of user interface that removes the user expertise requirement, presenting control settings in the language of the laboratory that are translated to optimal algorithmic settings. To validate this new algorithm, a comparison with Mascot is presented for a series of analogous searches to explore the relative impact of increasing search space probed with Mascot by relaxing the tryptic digestion conformance requirements from trypsin to semitrypsin to no enzyme and with the Paragon Algorithm using its Rapid mode and Thorough mode with and without tryptic specificity. Although they performed similarly for small search space, dramatic differences were observed in large search space. With the Paragon Algorithm, hundreds of biological and artifact modifications, all possible substitutions, and all levels of conformance to the expected digestion pattern can be searched in a single search step, yet the typical cost in search time is only 2-5 times that of conventional small search space. Despite this large increase in effective search space, there is no drastic loss of discrimination that typically accompanies the exploration of large search space.
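
The idea of rating each region of a database on a continuum can be illustrated with a toy calculation. The sketch below is not the published Paragon scoring: the function name, the fixed window length, the tag confidences, and the rule of summing the confidences of tags found in a window are all assumptions chosen only to show how sequence tags could yield a per-segment "temperature".

```python
def sequence_temperature_values(protein, tags, window=7):
    """Toy temperature per sequence window (illustrative assumption, not the paper's formula).

    protein: amino acid sequence as a string.
    tags:    list of (tag_sequence, confidence) pairs derived from one MS/MS spectrum.
    window:  hypothetical segment length.
    Returns a list of (start_index, segment, temperature).
    """
    stvs = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        # A window is "hotter" the more confident tags it contains.
        temperature = sum(conf for tag, conf in tags if tag in segment)
        stvs.append((start, segment, temperature))
    return stvs


# Hypothetical inputs: a short sequence and two tags with made-up confidences.
protein = "MKWVTFISLLLLFSSAYSR"
tags = [("VTF", 0.9), ("LLF", 0.5)]
stvs = sequence_temperature_values(protein, tags)
hottest = max(stvs, key=lambda item: item[2])
```

In this toy version most windows have a temperature of zero, so only a small part of the database would be singled out for closer examination.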

Highlights

The Paragon™ Algorithm, a novel database search engine for the identification of peptides from tandem mass spectrometry data, is presented.
This study presents a new software technology for the identification of peptides from tandem mass spectra called the Paragon™ Algorithm, hereafter referred to interchangeably as “Paragon.” The most common application for this class of software tools is so-called “shotgun” or “bottom-up” proteomics experiments [1] where a protein mixture of any complexity is digested with a proteolytic enzyme or reagent, the peptides are analyzed by tandem mass spectrometry, and software of this type is used to identify the peptides [2, 3] and, by inference, determine which proteins have been detected in the mixture [4].
The Paragon Algorithm and this study focus on the peptide identification process.

Summary

Introduction

The Paragon™ Algorithm, a novel database search engine for the identification of peptides from tandem mass spectrometry data, is presented. Sequence segments with very “hot” Sequence Temperature Values (STVs) are searched across a very large search space, so that peptide hypotheses containing lower-probability features, such as unlikely modifications and unexpected cleavages, are also considered.
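
One way to picture this is a probability threshold that drops as a segment gets hotter. The following is a minimal sketch under stated assumptions, not the algorithm's actual implementation: the feature names, their prior probabilities, the mapping from temperature to threshold, and the limit of two features per hypothesis are all hypothetical.

```python
from itertools import combinations
from math import prod

# Hypothetical prior probabilities for peptide features (illustrative values only).
FEATURE_PROBS = {
    "oxidation(M)": 0.05,
    "deamidation(N)": 0.02,
    "nontryptic_cleavage": 0.01,
    "substitution": 0.001,
}


def probability_threshold(temperature):
    """Assumed mapping: temperature 0 gives 1e-1, temperature 1 gives 1e-6."""
    return 10.0 ** (-1 - 5 * temperature)


def hypotheses_for_segment(temperature, max_features=2):
    """List the feature combinations worth scoring for a segment of given temperature."""
    threshold = probability_threshold(temperature)
    kept = [()]  # the unmodified, fully conforming peptide is always considered
    for k in range(1, max_features + 1):
        for combo in combinations(FEATURE_PROBS, k):
            # Treat features as independent, so the joint prior is a product.
            if prod(FEATURE_PROBS[f] for f in combo) >= threshold:
                kept.append(combo)
    return kept


cold = hypotheses_for_segment(0.0)
warm = hypotheses_for_segment(0.5)
hot = hypotheses_for_segment(1.0)
```

Under these assumptions a cold segment is scored only in its unmodified form, while a hot segment is scored with every single feature and every pair of features. Because few segments are hot for any given spectrum, the total number of scored hypotheses grows far more slowly than the effective search space.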

Journal: Molecular & Cellular Proteomics
Publication Date: Sep 1, 2007
Citations: 1228
License type: cc-by
