Abstract

Although mass spectrometry is well-suited to identifying thousands of possible protein post-translational modifications (PTMs), it has historically been biased towards just a few. To measure the entire set of PTMs across diverse proteomes, software must overcome the dual challenges of exploring enormous search spaces and distinguishing correct from incorrect spectrum interpretations. Here, we describe TagGraph, a computational tool that overcomes both challenges with an unrestricted string-based search method that is as much as 350-fold faster than existing approaches, and a probabilistic validation model that we optimized for PTM assignments. We applied TagGraph to a published human proteomic data set of 25 million mass spectra and tripled confident spectrum identifications compared with its original analysis. We identified thousands of modification types on almost one million sites in the proteome. We show new contexts for highly abundant yet understudied PTMs such as proline hydroxylation, and its unexpected association with cancer mutations. By enabling broad characterization of PTMs, TagGraph informs how their functions and regulation intersect.
