Abstract

Recent advances in liquid chromatography/mass spectrometry (LC/MS) technology have improved the sensitivity, resolution, and speed of proteome analysis, increasing the demand for more sophisticated algorithms to interpret complex mass spectrograms. Here, we propose a novel statistical method, proteomic mass spectrogram decomposition (ProtMSD), for joint identification and quantification of peptides and proteins. Given the proteomic mass spectrogram and, as a dictionary, the reference mass spectra of all possible peptide ions associated with proteins, ProtMSD estimates the chromatograms of those peptide ions under a group sparsity constraint, without the careful preprocessing that is conventionally required (e.g., thresholding and peak picking). We show that the method is significantly improved by using protein-peptide hierarchical relationships, isotopic distribution profiles, reference retention times of peptide ions, and prelearned mass spectra of noise. We examined the method under the concepts of database search, library search, and match-between-runs. ProtMSD showed excellent agreement with Mascot/Skyline for an Escherichia coli proteome sample, with 3277 peptide ions (94.79%) and 493 proteins (98.21%), and with match-between-runs by MaxQuant for a yeast proteome sample, with 4460 peptide ions (103%) and 588 proteins (101%). This is the first attempt to use a matrix decomposition technique as a tool for LC/MS-based proteome identification and quantification.
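
To make the decomposition idea concrete, the following is a minimal sketch and not the ProtMSD algorithm itself. It assumes the spectrogram is a nonnegative matrix Y (m/z bins by time), the dictionary W holds one reference spectrum per peptide ion, and the chromatograms H are estimated by proximal gradient descent on a least-squares fit with a group-lasso penalty, where each group collects the peptide ions of one protein. The function name `decompose_group_sparse`, the penalty weight `lam`, and the choice of solver are all illustrative assumptions; the abstract does not specify the objective or the optimizer.

```python
import numpy as np

def decompose_group_sparse(Y, W, groups, lam=0.1, n_iter=500):
    """Estimate nonnegative activations H such that Y is approximately W @ H.

    Y      : (n_mz, n_time) observed spectrogram.
    W      : (n_mz, n_ions) fixed dictionary of reference spectra.
    groups : list of index lists, one per protein, partitioning the rows of H.
    lam    : weight of the group-sparsity penalty (hypothetical parameter).
    """
    n_ions, n_time = W.shape[1], Y.shape[1]
    H = np.zeros((n_ions, n_time))
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(W, 2) ** 2
    for _ in range(n_iter):
        # Gradient step on 0.5 * ||Y - W H||_F^2, then clip to H >= 0.
        grad = W.T @ (W @ H - Y)
        H = np.maximum(H - step * grad, 0.0)
        # Group soft-thresholding: shrink each protein's block of rows,
        # setting the whole block to zero when its norm is small.
        for idx in groups:
            norm = np.linalg.norm(H[idx])
            scale = max(0.0, 1.0 - step * lam / norm) if norm > 0 else 0.0
            H[idx] *= scale
    return H
```

In this sketch, a protein whose peptide ions explain little of the signal has its entire block of chromatograms driven to exactly zero, which is one way a group sparsity constraint can couple identification (which groups survive) with quantification (the magnitude of the surviving rows).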
