Abstract
Liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based proteomics provides a wealth of information about proteins present in biological samples. In bottom-up LC-MS/MS-based proteomics, proteins are enzymatically digested into peptides prior to query by LC-MS/MS. Thus, the information directly available from the LC-MS/MS data is at the peptide level. If a protein-level analysis is desired, the peptide-level information must be rolled up into protein-level information. We propose a principal component analysis-based statistical method, ProPCA, for efficiently estimating relative protein abundance from bottom-up label-free LC-MS/MS data that incorporates both spectral count information and LC-MS peptide ion peak attributes, such as peak area, volume, or height. ProPCA may be used effectively with a variety of quantification platforms and is easily implemented. We show that ProPCA outperformed existing quantitative methods for peptide-protein roll-up, including spectral counting methods and other methods for combining LC-MS peptide peak attributes. The performance of ProPCA was validated using a data set derived from the LC-MS/MS analysis of a mixture of protein standards (the UPS2 proteomic dynamic range standard introduced by The Association of Biomolecular Resource Facilities Proteomics Standards Research Group in 2006). Finally, we applied ProPCA to a comparative LC-MS/MS analysis of digested total cell lysates prepared for LC-MS/MS analysis by alternative lysis methods and show that ProPCA identified more differentially abundant proteins than competing methods.
Highlights
Liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based proteomics provides a wealth of information about proteins present in biological samples.
Protein Standards—The data set used to validate the performance of ProPCA was derived from the LC-MS/MS analysis of fractions of the UPS2 proteomic dynamic range standard set.
Data Processing Step—ProPCA relies on spectral count (SC) and peptide peak attribute (PPA) measurements that must be extracted from the raw LC-MS/MS data.
Summary
Protein Identification by One-dimensional Nano-LC-Tandem Mass Spectrometry—A CTC Autosampler (LEAP Technologies) was equipped with two 10-port Valco valves and a 20-µl injection loop.
To obtain PPA measurements, features in the LC-MS domain were matched to peptides identified via MS/MS spectra with the aid of the AMT database. For both the UPS2 and HepG2 data, the non-default settings used for the matching procedure were “deltatimems1ms2” of 200 and “minpprophet” of 0.95. The supplemental data include information from .pepXML files and msInspect/AMT match files, which contain PPA measurements, for all UPS2 and HepG2 samples.
When PPA measurements are available for all indices i and k (that is, there are no missing data), the ProPCA estimates correspond to the first principal component obtained by performing PCA on the data matrix (y_{ijk})_{i,k} for protein j.
Software Availability—The R code for implementing ProPCA, given log(SC) and log(PPA) data, is included in the supplemental data and is available at http://www.hsph.harvard.edu/proteomics/software
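The complete-data case described above, where the estimates correspond to the first principal component of the data matrix for one protein, can be illustrated with a short sketch. This is not the authors' R code. It assumes that rows are the index i and columns are the index k, that the columns hold log(SC) and log(PPA) values, and that the sign of the component is fixed so the loadings sum to a positive number. The function name first_pc and the toy numbers are hypothetical and chosen only for illustration.

```python
import math


def first_pc(matrix, iters=500, tol=1e-12):
    """First principal component of a complete data matrix (no missing values).

    Assumed layout (not taken from the paper): rows are index i,
    columns are index k (log(SC) and log(PPA) measures for one protein j).
    Returns (loadings, scores), where scores has one value per row.
    """
    n, p = len(matrix), len(matrix[0])
    if n < 2:
        raise ValueError("need at least two rows to estimate a covariance")

    # Center each column so the component describes variation around the mean.
    means = [sum(row[k] for row in matrix) / n for k in range(p)]
    centered = [[row[k] - means[k] for k in range(p)] for row in matrix]

    # Sample covariance matrix of the columns (p x p).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
        for a in range(p)
    ]

    # Power iteration for the leading eigenvector. The uniform start vector
    # assumes it is not orthogonal to that eigenvector, which holds when the
    # columns are positively correlated.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all columns constant: no variation to explain
            break
        w = [x / norm for x in w]
        delta = max(abs(w[a] - v[a]) for a in range(p))
        v = w
        if delta < tol:
            break

    # Principal components are defined only up to sign; this convention
    # (positive loading sum) is an assumption made for the sketch.
    if sum(v) < 0:
        v = [-x for x in v]

    # Scores: projection of each centered row onto the loading vector.
    scores = [sum(r[k] * v[k] for k in range(p)) for r in centered]
    return v, scores


# Hypothetical toy data: 4 rows (index i) by 3 columns (index k), log scale.
y_j = [
    [1.0, 2.0, 1.5],
    [2.0, 3.1, 2.4],
    [3.0, 3.9, 3.6],
    [4.0, 5.0, 4.5],
]
loadings, scores = first_pc(y_j)
```

In this toy example every column rises from the first row to the last, so the scores rise as well and can be read as relative abundance on a log scale. The sketch does not handle missing PPA measurements, which the text treats as a separate case.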