Peptide-spectrum Matches Research Articles

Cross-linking tandem mass spectrometry (XL-MS/MS) is an established analytical platform for determining distance constraints between residues within a protein or between physically interacting proteins, thus improving our understanding of protein structure and function. To aid biological discovery with XL-MS/MS, pairs of chemically linked peptides must be identified accurately, a process that requires (i) a database search, which creates a ranked list of candidate peptide pairs for each experimental spectrum, and (ii) false discovery rate (FDR) estimation, which determines the probability of a false match in a group of top-ranked peptide pairs with scores above a given threshold. Currently, the only available FDR estimation mechanism in XL-MS/MS is the target-decoy approach (TDA). However, despite its simplicity, TDA has both theoretical and practical limitations that affect estimation accuracy and increase run time relative to potential decoy-free approaches (DFAs). We introduce a novel decoy-free framework for FDR estimation in XL-MS/MS. Our approach relies on multi-sample mixtures of skew normal distributions, where the latent components correspond to the scores of correct peptide pairs (both peptides identified correctly), partially incorrect peptide pairs (one peptide identified correctly, the other incorrectly), and incorrect peptide pairs (both peptides identified incorrectly). To learn these components, we exploit the score distributions of first- and second-ranked peptide-spectrum matches for each experimental spectrum and subsequently estimate FDR using a novel expectation-maximization algorithm with constraints. We evaluate the method on ten datasets and provide evidence that the proposed DFA is theoretically sound and a viable alternative to TDA owing to its good performance in terms of accuracy, variance of estimation, and run time. The implementation is available at https://github.com/shawn-peng/xlms.
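
To make the mixture idea concrete, the following is a minimal Python sketch of how an FDR could be read off a fitted three-component skew normal mixture. It is not the authors' implementation: the component parameters are hypothetical values chosen for illustration (in the paper they would come from the constrained expectation-maximization fit), and counting both partially incorrect and incorrect pairs as false matches is an assumption made here. The function names `skew_normal_pdf`, `tail_mass`, and `mixture_fdr` are likewise illustrative.

```python
import math


def skew_normal_pdf(x, loc, scale, shape):
    """Skew normal density: (2 / scale) * phi(z) * Phi(shape * z)."""
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * big_phi


def tail_mass(t, loc, scale, shape, steps=4000):
    """Approximate P(score >= t) by trapezoidal integration.

    The integral is truncated 12 scale units above loc, where the
    remaining mass is negligible.
    """
    upper = loc + 12.0 * scale
    if t >= upper:
        return 0.0
    h = (upper - t) / steps
    total = 0.5 * (skew_normal_pdf(t, loc, scale, shape)
                   + skew_normal_pdf(upper, loc, scale, shape))
    for i in range(1, steps):
        total += skew_normal_pdf(t + i * h, loc, scale, shape)
    return total * h


def mixture_fdr(t, components):
    """Expected fraction of false matches among scores >= t.

    components: list of (weight, loc, scale, shape, is_false) tuples.
    """
    false_tail = 0.0
    all_tail = 0.0
    for weight, loc, scale, shape, is_false in components:
        mass = weight * tail_mass(t, loc, scale, shape)
        all_tail += mass
        if is_false:
            false_tail += mass
    return false_tail / all_tail if all_tail > 0 else 0.0


# Hypothetical fitted components (illustrative values only).
COMPONENTS = [
    (0.4, 20.0, 6.0, 2.0, False),  # correct pairs
    (0.3, 10.0, 4.0, 1.0, True),   # partially incorrect pairs
    (0.3, 5.0, 3.0, 1.0, True),    # incorrect pairs
]

for threshold in (5.0, 15.0, 25.0):
    print(threshold, round(mixture_fdr(threshold, COMPONENTS), 4))
```

With parameters like these, raising the score threshold lowers the estimated FDR, because the false components contribute less and less of the tail mass above the threshold.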

Discovering noncanonical peptides is a common application of proteogenomics. Recent studies suggest that certain noncanonical peptides that bind to major histocompatibility complex-I (MHC-I), known as noncanonical MHC-I-associated peptides (ncMAPs), may make good immunotherapeutic targets. De novo peptide sequencing is well suited to finding ncMAPs because it infers peptide sequences from their tandem mass spectra without using any sequence database. However, this strategy has not been widely applied to ncMAP identification because there is no reliable way to estimate its false-positive rate. To identify immunopeptides comprehensively and accurately using de novo peptide sequencing, we describe a pipeline called proteomics X genomics (PXg). In contrast to current pipelines, it uses genomic data (RNA-Seq abundance and sequencing quality) in addition to proteomic features to increase the sensitivity and specificity of peptide identification. We show that peptide-spectrum match quality is clearly related to these genomic features, indicating that they can be used to evaluate peptide-spectrum matches. Using target-decoy competition, we identified 24,449 canonical MHC-I–associated peptides and 956 ncMAPs from 10 samples. Of these, 387 ncMAPs and 1611 canonical MHC-I–associated peptides were new identifications that had not previously been published. Thanks to the unrestricted search space of de novo sequencing, we also discovered 11 ncMAPs derived from a squirrel monkey retrovirus in human cell lines and two ncMAPs originating from a complementarity determining region 3 of an antibody. These entirely new identifications show that proteomics X genomics can exploit the advantages of de novo peptide sequencing and demonstrate its potential use in the search for new immunotherapeutic targets.
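
The target-decoy competition mentioned in this abstract can be sketched generically in Python. This is a textbook-style illustration under stated assumptions, not the PXg pipeline: each spectrum is assumed to have already been reduced to its single winning match (target or decoy), the FDR at a threshold is estimated as the ratio of decoy to target matches at or above it, and the scores, the `psms` list, and the function names are hypothetical.

```python
def tdc_fdr(psms, threshold):
    """Estimate FDR among target matches scoring at or above threshold.

    psms: list of (score, is_decoy) pairs, one winning match per spectrum.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return min(1.0, decoys / targets)


def score_cutoff(psms, max_fdr):
    """Lowest observed score whose estimated FDR is at most max_fdr."""
    best = None
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if tdc_fdr(psms, threshold) <= max_fdr:
            best = threshold
    return best


# Hypothetical winning matches after competition (illustrative only).
psms = [
    (9.1, False), (8.7, False), (8.2, False), (7.9, True), (7.5, False),
    (6.8, False), (6.1, True), (5.9, False), (5.2, True), (4.8, False),
]

print(tdc_fdr(psms, 7.0))        # 1 decoy / 4 targets -> 0.25
print(score_cutoff(psms, 0.25))  # -> 6.8
```

Some tools add one to the decoy count for a more conservative estimate; the plain ratio is used here only to keep the sketch short.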

Articles published on Peptide-spectrum Matches

A multi-species benchmark for training and validating mass spectrometry proteomics machine learning models

NovoBoard: A Comprehensive Framework for Evaluating the False Discovery Rate and Accuracy of De Novo Peptide Sequencing

Dear-PSM: A deep learning-based peptide search engine enables full database search for proteomics

NeoMS: Mass Spectrometry-based Method for Uncovering Mutated MHC-I Neoantigens

An algorithm for decoy-free false discovery rate estimation in XL-MS/MS proteomics

Comparative analysis of plasma affinity depletion methods: Impact on protein composition and phosphopeptide abundance in human plasma

High-Throughput Proteomics Enabled by a Fully Automated Dual-Trap and Dual-Column LC-MS

APIR: Aggregating Universal Proteomics Database Search Algorithms for Peptide Identification with FDR Control

Rescoring Peptide Spectrum Matches: Boosting Proteomics Performance by Integrating Peptide Property Predictors Into Peptide Identification

Fragment ion intensity prediction improves the identification rate of non-tryptic peptides in timsTOF

Comprehensive shotgun proteomic characterization and virulence factors of seafood spoilage bacteria

MS2Rescore 3.0 Is a Modular, Flexible, and User-Friendly Platform to Boost Peptide Identifications, as Showcased with MS Amanda 3.0

Unraveling the Intraday Variations in the Tear Fluid Proteome

PXg: Comprehensive Identification of Noncanonical MHC-I–Associated Peptides From De Novo Peptide Sequencing Using RNA-Seq Reads

Introducing π-HelixNovo for practical large-scale de novo peptide sequencing

The proteomic landscape of sperm surface deciphers its maturational and functional aspects in buffalo

Making MS Omics Data ML-Ready: SpeCollate Protocols

Quantifying the Impact of the Peptide Identification Framework on the Results of Fast Photochemical Oxidation of Protein Analysis

Test-Time Training for Deep MS/MS Spectrum Prediction Improves Peptide Identification

Deep Learning Prediction Boosts Phosphoproteomics-Based Discoveries Through Improved Phosphopeptide Identification
