Abstract
MS-based de novo peptide sequencing has improved remarkably with advances in mass spectrometry and computational approaches, but it still lacks quality-control methods. Here we propose a novel algorithm, pSite, to evaluate the confidence of each amino acid, rather than of the full-length peptide, obtained by de novo peptide sequencing. A semi-supervised learning approach was used to discriminate correct amino acids from random ones; then, an expectation-maximization algorithm was used to adaptively control the false amino-acid rate (FAR). On three test data sets, pSite recalled 86% more amino acids on average than PEAKS at a FAR of 5%. pSite also performed better on the modification site localization problem, which is essentially a special case of amino acid confidence evaluation. On three phosphopeptide data sets, at a false localization rate of 1%, the average recall of pSite was 91%, while those of Ascore and phosphoRS were 64% and 63%, respectively. pSite covered 98% of the Ascore and phosphoRS results and contributed 21% more phosphorylation sites. Further analyses show that the use of distinct fragmentation features in high-resolution MS/MS spectra, such as neutral loss ions, played an important role in improving the precision of pSite. In summary, the effective and universal model, together with the extensive use of spectral information, makes pSite an excellent quality-control tool for both de novo peptide sequencing and modification site localization.
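The idea of using expectation-maximization to control a false rate can be illustrated with a minimal sketch. The abstract does not specify pSite's features, classifier, or mixture model, so everything below is an assumption made for illustration: each amino acid is assumed to carry a single discriminant score (higher meaning more likely correct), the scores are modeled as a mixture of two Gaussians fitted by EM, and the FAR of an accepted set is estimated as the mean posterior probability of being incorrect. The function names `fit_two_gaussians`, `prob_incorrect`, and `select_at_far` are hypothetical and are not part of pSite.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(scores, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative only).

    Returns (weights, means, sigmas), each a list of two floats.
    """
    n = len(scores)
    ordered = sorted(scores)
    mean = sum(ordered) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in ordered) / n) or 1.0
    # Initialize the two means at the lower and upper quartiles.
    mu = [ordered[n // 4], ordered[(3 * n) // 4]]
    sigma = [sd, sd]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each score.
        resp = []
        for x in scores:
            p0 = w[0] * normal_pdf(x, mu[0], sigma[0])
            p1 = w[1] * normal_pdf(x, mu[1], sigma[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0 else 0.5)
        # M-step: re-estimate weights, means, and standard deviations.
        n1 = sum(resp)
        n0 = n - n1
        if n0 < 1e-9 or n1 < 1e-9:
            break  # one component collapsed; keep the last valid parameters
        mu = [
            sum((1 - r) * x for r, x in zip(resp, scores)) / n0,
            sum(r * x for r, x in zip(resp, scores)) / n1,
        ]
        var0 = sum((1 - r) * (x - mu[0]) ** 2 for r, x in zip(resp, scores)) / n0
        var1 = sum(r * (x - mu[1]) ** 2 for r, x in zip(resp, scores)) / n1
        sigma = [max(math.sqrt(var0), 1e-6), max(math.sqrt(var1), 1e-6)]
        w = [n0 / n, n1 / n]
    return w, mu, sigma


def prob_incorrect(x, w, mu, sigma):
    """Posterior probability that a score belongs to the lower-mean component.

    Assumption: the lower-mean component models incorrect amino acids.
    """
    bad = 0 if mu[0] <= mu[1] else 1
    good = 1 - bad
    p_bad = w[bad] * normal_pdf(x, mu[bad], sigma[bad])
    p_good = w[good] * normal_pdf(x, mu[good], sigma[good])
    total = p_bad + p_good
    return p_bad / total if total > 0 else 0.5


def select_at_far(scores, target_far=0.05):
    """Return (accepted indices, estimated FAR) for the largest accepted set
    whose mean posterior error probability does not exceed target_far."""
    w, mu, sigma = fit_two_gaussians(scores)
    pep = [prob_incorrect(x, w, mu, sigma) for x in scores]
    order = sorted(range(len(scores)), key=lambda i: pep[i])
    running, best, best_far = 0.0, 0, 0.0
    for k, i in enumerate(order, start=1):
        running += pep[i]
        if running / k > target_far:
            break  # the running mean only grows, since pep is sorted ascending
        best, best_far = k, running / k
    return order[:best], best_far


# Synthetic demonstration data (not from the paper).
random.seed(0)
demo_scores = [random.gauss(3.0, 1.0) for _ in range(700)]
demo_scores += [random.gauss(-2.0, 1.0) for _ in range(300)]
accepted, est_far = select_at_far(demo_scores, target_far=0.05)
print(len(accepted), round(est_far, 4))
```

Because the mixture is refitted on each data set, the acceptance threshold adapts to the score distribution at hand instead of being fixed in advance, which is one plausible reading of "adaptively control" in the abstract.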