Cross-linking mass spectrometry (XL-MS) enables the characterization of protein-protein interactions (PPIs) in native biological systems by capturing cross-links between different proteins (inter-links). However, inter-link identification remains challenging, requiring dedicated data filtering schemes and thorough error control. Here, we benchmark existing data filtering schemes combined with error rate estimation strategies that use concatenated target-decoy protein sequence databases. These workflows show shortcomings in either sensitivity (many false negatives) or specificity (many false positives). To improve sensitivity without compromising specificity, we develop an alternative target-decoy search strategy using fused target-decoy databases. Furthermore, we devise a different data filtering scheme that takes the inter-link context of the XL-MS dataset into account. Combining both approaches maintains low error rates and minimizes false negatives, as we show by mathematical simulations, analysis of experimental ground-truth data, and application to various biological datasets. In human cells, inter-link identifications increase by 75%, and we confirm their structural accuracy through proteome-wide comparisons to AlphaFold2-derived models. Taken together, target-decoy fusion and context-sensitive data filtering deepen and fine-tune XL-MS-based interactomics.