PEPPeR, a Platform for Experimental Proteomic Pattern Recognition

Michael A. Gillette, Kyriacos C. Leptos, George M. Church, Steven A. Carr, Jacob D. Jaffe, D.R. Mani

doi:10.1074/mcp.m600222-mcp200


Open Access

https://doi.org/10.1074/mcp.m600222-mcp200


Abstract

Quantitative proteomics holds considerable promise for elucidation of basic biology and for clinical biomarker discovery. However, it has been difficult to fulfill this promise due to over-reliance on identification-based quantitative methods and problems associated with chromatographic separation reproducibility. Here we describe new algorithms termed "Landmark Matching" and "Peak Matching" that greatly reduce these problems. Landmark Matching performs time base-independent propagation of peptide identities onto accurate mass LC-MS features in a way that leverages historical data derived from disparate data acquisition strategies. Peak Matching builds upon Landmark Matching by recognizing identical molecular species across multiple LC-MS experiments in an identity-independent fashion by clustering. We have bundled these algorithms together with other algorithms, data acquisition strategies, and experimental designs to create a Platform for Experimental Proteomic Pattern Recognition (PEPPeR). These developments enable use of established statistical tools previously limited to microarray analysis for treatment of proteomics data. We demonstrate that the proposed platform can be calibrated across 2.5 orders of magnitude and can perform robust quantification of ratios in both simple and complex mixtures with good precision and error characteristics across multiple sample preparations. We also demonstrate de novo marker discovery based on statistical significance of unidentified accurate mass components that changed between two mixtures. These markers were subsequently identified by accurate mass-driven MS/MS acquisition and demonstrated to be contaminant proteins associated with known proteins whose concentrations were designed to change between the two mixtures. These results have provided a real world validation of the platform for marker discovery.

Highlights

Quantitative proteomics holds considerable promise for elucidation of basic biology and for clinical biomarker discovery
Propagation of Identities by Landmark Matching: Landmark matching provides an alternative to traditional chromatographic alignment by using relative chromatographic elution order information and sequence-identified “landmarks” to assign peptide identities to LC-MS peaks and propagate them across multiple LC-MS experiments
This is achieved by performing a limited number of data-dependent MS/MS scan acquisitions in an LC-MS experiment that is primarily designed for chromatographic resolution and quantification

Summary

EXPERIMENTAL PROCEDURES

Information about confidently identified peptides is retained in the BASIS SET, namely the peptide sequence, the charge state in which it was observed, the experiment in which it was observed, and the scan boundaries spanning the MS/MS scans grouped together for sequence identification. These scan boundaries become the basis for absolute or relative retention time comparisons in later landmark matching. Peptides sequenced during the CURRENT EXPERIMENT are mapped onto features identified by MapQuant in that experiment using a loose m/z matching TOLERANCE (±25 ppm) and an absolute retention time RADIUS (typically 0.3 min). Any directly sequenced features for that experiment (LANDMARKS) that were not included in the final match list are merged back into the data set.
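
The mapping of sequenced peptides onto features can be illustrated with a minimal Python sketch. It is not the PEPPeR or MapQuant implementation: the class names, the function names, the use of a single representative retention time per peptide (rather than scan boundaries), and the rule of keeping the candidate with the smallest mass error are all assumptions made for illustration. Only the ±25 ppm TOLERANCE and the 0.3 min RADIUS come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """An LC-MS feature (hypothetical structure, not MapQuant's output format)."""
    mz: float
    rt_min: float


@dataclass(frozen=True)
class SequencedPeptide:
    """A confidently identified peptide (hypothetical structure)."""
    sequence: str
    charge: int
    mz: float
    rt_min: float  # assumption: one representative retention time in minutes


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass error of observed_mz relative to reference_mz, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_peptides_to_features(peptides, features,
                               tol_ppm=25.0, rt_radius_min=0.3):
    """Map each peptide to a feature inside both the m/z and RT windows.

    Returns a dict keyed by (sequence, charge). Peptides without any
    candidate feature are omitted. Tie-breaking by smallest absolute ppm
    error is an assumption, not a rule stated in the text.
    """
    matches = {}
    for pep in peptides:
        candidates = [
            f for f in features
            if abs(ppm_error(f.mz, pep.mz)) <= tol_ppm
            and abs(f.rt_min - pep.rt_min) <= rt_radius_min
        ]
        if candidates:
            best = min(candidates, key=lambda f: abs(ppm_error(f.mz, pep.mz)))
            matches[(pep.sequence, pep.charge)] = best
    return matches
```

In this sketch a peptide that finds no feature within both windows is simply left out of the result; under the procedure described above, such directly sequenced landmarks would instead be merged back into the data set.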

Experiments

RESULTS

DISCUSSION