Abstract
This paper describes an algorithm to apply proteotypic peptide sequence libraries to protein identifications performed using tandem mass spectrometry (MS/MS). Proteotypic peptides are those peptides in a protein sequence that are most likely to be confidently observed by current MS-based proteomics methods. Libraries of proteotypic peptide sequences were compiled from the Global Proteome Machine Database for Homo sapiens and Saccharomyces cerevisiae model species proteomes. These libraries were used to scan through collections of tandem mass spectra to discover which proteins were represented by the data sets, followed by detailed analysis of the spectra with the full protein sequences corresponding to the discovered proteotypic peptides. This algorithm (Proteotypic Peptide Profiling, or P3) resulted in sequence-to-spectrum matches comparable to those obtained by conventional protein identification algorithms using only full protein sequences, with a 20-fold reduction in the time required to perform the identification calculations. The proteotypic peptide libraries, the open source code for the implementation of the search algorithm and a website for using the software have been made freely available. Approximately 4% of the residues in the H. sapiens proteome were required in the proteotypic peptide library to successfully identify proteins.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.