Abstract
Recent mass spectrometry (MS)-based techniques enable deep proteome coverage with relative quantitative analysis, resulting in increased identification of very weak signals accompanied by increased data size of liquid chromatography (LC)–MS/MS spectra. However, when weak signals are identified using a poorly performing assignment strategy, quantification becomes imperfect, with misidentified peaks and distorted ratios. Manually annotating a large number of signals within a very large dataset is not realistic. In this study, we therefore used machine learning algorithms to extract a larger number of peptide peaks with high accuracy and precision. Our strategy evaluated each identified peak using six different algorithms; only peptide peaks accepted by all six algorithms (i.e., unanimously selected) were assigned as true peaks, which reduced the false-positive rate. Hence, exact and highly quantitative peptide peaks were obtained, with better performance than that obtained by applying the conventional criteria or using a single machine learning algorithm.
Highlights
Recent mass spectrometry (MS)-based techniques enable deep proteome coverage with relative quantitative analysis, resulting in increased identification of very weak signals accompanied by increased data size of liquid chromatography (LC)–MS/MS spectra
To identify the proteins involved in physiologic and/or pathologic processes based on abundance using shotgun proteomics, poor chromatographic peaks must be excluded from complex liquid chromatography–mass spectrometry (LC–MS)/MS spectra using conventional criteria, such as isotope dot product (idotP) and ∆M[8–12], so that quantifications based on the areas of extracted peaks of identified proteins can be compared
We introduced six machine learning algorithms to successfully extract a higher number of peptide peaks with high accuracy and precision
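The conventional criteria named above can be illustrated with a minimal sketch. It assumes that idotP is computed as the normalized dot product (cosine similarity) between observed and expected isotope intensities and that ∆M is the mass error in parts per million. The function names and the thresholds (0.9 and 5 ppm) are hypothetical and are not taken from the study.

```python
import math


def isotope_dot_product(observed, expected):
    """Normalized dot product of observed vs. expected isotope intensities (assumed idotP form)."""
    if len(observed) != len(expected):
        raise ValueError("isotope intensity lists must have the same length")
    dot = sum(o * e for o, e in zip(observed, expected))
    norm = math.sqrt(sum(o * o for o in observed)) * math.sqrt(sum(e * e for e in expected))
    return dot / norm if norm > 0 else 0.0


def mass_error_ppm(observed_mz, theoretical_mz):
    """Mass error (delta M) of a peak in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_conventional_criteria(observed, expected, observed_mz, theoretical_mz,
                                 min_idotp=0.9, max_ppm=5.0):
    """Keep a peak only if both criteria hold; both thresholds are illustrative assumptions."""
    idotp_ok = isotope_dot_product(observed, expected) >= min_idotp
    mass_ok = abs(mass_error_ppm(observed_mz, theoretical_mz)) <= max_ppm
    return idotp_ok and mass_ok
```

A peak whose isotope pattern closely follows the expected distribution and whose mass error is small passes; a distorted isotope pattern fails even when the mass error is small.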
Summary
Recent mass spectrometry (MS)-based techniques enable deep proteome coverage with relative quantitative analysis, resulting in increased identification of very weak signals accompanied by increased data size of liquid chromatography (LC)–MS/MS spectra. A mass precision algorithm was developed to extract the signal from the noise, improving quantitation using a random forest (RF) classifier and heuristic score[13]. Another algorithm has been released that distinguishes quantitative peaks from interfering peaks or poor chromatograms in targeted proteomics using a supervised machine learning approach[14]. We adopted idotP and ∆M in addition to seven other informative features of chromatographic peaks. We examined these features using six different types of supervised machine learning algorithms to individually extract the peptide peaks. Because requiring unanimous agreement among all six algorithms reduces the false-positive rate, the advantage of this system is that it extracts more exact and highly quantitative peptide peaks than a single supervised machine learning procedure or the conventional criteria. We report an example of such quantitative comparisons using our unanimous peak assignment procedure.
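The unanimous assignment rule can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each classifier has already produced one accept/reject call per candidate peak, and the classifier labels `clf_1` to `clf_6` and the function name `unanimous_peaks` are hypothetical placeholders, since the six algorithms are not named here.

```python
def unanimous_peaks(votes):
    """Return indices of peaks accepted by every classifier.

    votes maps a classifier label to a list of booleans, one per candidate peak.
    """
    if not votes:
        return []
    lengths = {len(calls) for calls in votes.values()}
    if len(lengths) != 1:
        raise ValueError("every classifier must vote on the same number of peaks")
    n_peaks = lengths.pop()
    # A peak is assigned as true only if no classifier rejects it.
    return [i for i in range(n_peaks) if all(calls[i] for calls in votes.values())]


# Hypothetical calls from six classifiers on four candidate peaks.
example_votes = {
    "clf_1": [True, True, False, True],
    "clf_2": [True, False, False, True],
    "clf_3": [True, True, False, True],
    "clf_4": [True, True, True, True],
    "clf_5": [True, True, False, True],
    "clf_6": [True, True, False, True],
}
accepted = unanimous_peaks(example_votes)
```

In this example the second peak is rejected by a single classifier and is therefore dropped, which is how the unanimity requirement trades some sensitivity for a lower false-positive rate.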