Abstract

Crosslinking mass spectrometry has developed into a robust technique that is increasingly used to investigate the interactomes of organelles and cells. However, the incomplete and noisy information in the mass spectra of crosslinked peptides limits the number of protein–protein interactions that can be confidently identified. Here, we leverage chromatographic retention time information to aid the identification of crosslinked peptides from mass spectra. Our Siamese machine learning model xiRT achieves highly accurate retention time predictions of crosslinked peptides in a multi-dimensional separation of crosslinked E. coli lysate. Importantly, supplementing the search engine score with retention time features leads to a substantial increase in identified protein–protein interactions without affecting confidence. This approach is not limited to cell lysates and multi-dimensional separation; it also considerably improves the analysis of crosslinked multiprotein complexes with a single chromatographic dimension. Retention times are a powerful complement to mass spectrometric information for increasing the sensitivity of crosslinking mass spectrometry analyses.

Highlights

  • The resulting data were searched with an entrapment database approach (Fig. 1a), yielding 11,196 crosslinked peptide spectrum matches (CSMs) (11,072 target-target (TT), 87 target-decoy (TD), 37 decoy-decoy (DD); Supplementary Fig. 3) at a 1% CSM-level false discovery rate (FDR), separating self and heteromeric CSMs (refs. 16, 39, 40)

  • The human entrapment database allows error to be assessed independently of the target-decoy approach. This plays a critical role here because E. coli decoys will be used for the machine learning-based rescoring
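The TT, TD and DD counts quoted in the highlights can be turned into an error estimate. The sketch below assumes the estimator commonly used in crosslinking mass spectrometry, FDR ≈ (TD − DD) / TT; the text here does not state which formula was applied, and the function name `estimate_csm_fdr` is hypothetical.

```python
def estimate_csm_fdr(tt: int, td: int, dd: int) -> float:
    """Estimate the CSM-level FDR from target-decoy counts.

    Assumption: the (TD - DD) / TT estimator often used for crosslinks.
    tt: target-target matches, td: target-decoy, dd: decoy-decoy.
    """
    if tt <= 0:
        raise ValueError("need at least one target-target match")
    # Clamp at zero so that DD > TD does not yield a negative rate.
    return max(td - dd, 0) / tt


# Pooled counts quoted in the text (self and heteromeric CSMs combined).
total_csms = 11072 + 87 + 37
pooled_fdr = estimate_csm_fdr(tt=11072, td=87, dd=37)
print(f"{total_csms} CSMs, pooled FDR estimate: {pooled_fdr:.4f}")
```

Because the 1% threshold was applied with self and heteromeric CSMs separated, the pooled estimate computed here need not equal 1%.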


Summary

Introduction

Crosslinking mass spectrometry has developed into a robust technique that is increasingly used to investigate the interactomes of organelles and cells. The final parameters for the Siamese network architecture for crosslinks were obtained by a small grid search (6,453 unique peptide pairs at 1% CSM-FDR; Supplementary Fig. 5). Retention times (RTs) of crosslinked peptides can be learned robustly within a data set, making them available as features in a CSM rescoring framework.
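To illustrate what "Siamese" means for a crosslinked peptide pair, the following is a minimal NumPy sketch in which both peptides pass through one shared encoder and their encodings are summed before a linear output. It does not reproduce xiRT's actual layers or hyperparameters: the dimensions, the mean-pooled embedding, the additive merge, the untrained random weights and the names `encode` and `predict_rt` are all assumptions made for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

rng = np.random.default_rng(0)
EMBED_DIM, HIDDEN_DIM = 8, 16  # illustrative sizes, not tuned values

# One set of encoder parameters, shared by both branches (the Siamese part).
embedding = rng.normal(size=(len(AMINO_ACIDS), EMBED_DIM))
w_shared = rng.normal(size=(EMBED_DIM, HIDDEN_DIM))
b_shared = np.zeros(HIDDEN_DIM)

# Output layer mapping the merged encoding to one retention time value.
w_out = rng.normal(size=HIDDEN_DIM)
b_out = 0.0


def encode(peptide: str) -> np.ndarray:
    """Encode one peptide with the shared weights."""
    idx = [AA_INDEX[aa] for aa in peptide]
    pooled = embedding[idx].mean(axis=0)  # average over residues
    return np.tanh(pooled @ w_shared + b_shared)


def predict_rt(peptide_a: str, peptide_b: str) -> float:
    """Predict a (here untrained) retention time for a peptide pair."""
    merged = encode(peptide_a) + encode(peptide_b)  # order-invariant merge
    return float(merged @ w_out + b_out)


print(predict_rt("PEPTIDEK", "LKSAMPLER"))
```

Sharing the encoder and merging by addition makes the prediction independent of the order in which the two peptides are given, which suits a crosslinked pair that has no natural ordering.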

