Abstract

High-efficiency identification of intact glycopeptides from a shotgun glycoproteomic LC-MS2 dataset remains problematic. The prevalent approach of identifying de-N-glycosylated peptides is littered with false positives and addresses only the issue of site occupancy. Here, we present Sweet-Heart, a computational tool set developed to tackle the heart of the problems in MS2 sequencing of glycopeptides. It accepts low-resolution, low-accuracy ion trap MS2 data, filters for glycopeptides, couples knowledge-based de novo interpretation of glycosylation-dependent fragmentation patterns with protein database search, and uses a machine-learning algorithm to score the computed glycan and peptide combinations. Higher-ranking candidates are then compiled into a list of MS2/MS3 entries to drive subsequent rounds of targeted MS3 sequencing of the putative peptide backbone, allowing its validation by database search in a fully automated fashion. With additional fishing out of all related glycoforms and final data integration, the platform proves to be sufficiently sensitive and selective, conducive to novel glycosylation discovery, and robust enough to discriminate, among others, N-glycolyl neuraminic acid/fucose from N-acetyl neuraminic acid/hexose. A critical appraisal of its computing performance shows that Sweet-Heart allows high-sensitivity comprehensive mapping of site-specific glycosylation for isolated glycoproteins and facilitates analysis of glycoproteomic data.

The biological relevance of protein site-specific glycosylation cannot be meaningfully addressed without first defining its pattern by direct analysis of glycopeptides. Sweet-Heart is a novel suite of computational tools for automated analysis of mass spectrometry-based glycopeptide sequencing data. It is developed to accept ion trap MS2/MS3 data and uses a machine-learning algorithm to score and rank the candidate combinations of peptide core and glycosyl substituents. By eliminating the need for manual, labor-intensive, and subjective data interpretation, it facilitates high-throughput shotgun glycoproteomic data analysis and is conducive to the identification of unanticipated glycosylation, as demonstrated here with a recombinant EGFR.
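The N-glycolyl neuraminic acid/fucose versus N-acetyl neuraminic acid/hexose case mentioned above is difficult because the two residue pairs have the same elemental composition, so precursor mass alone cannot separate them and fragmentation evidence is needed. The short sketch below illustrates only that arithmetic. It uses standard residue formulas and monoisotopic atomic masses. The names `RESIDUES`, `composition`, and `monoisotopic_mass` are illustrative assumptions and are not part of Sweet-Heart.

```python
from collections import Counter

# Monoisotopic atomic masses (Da) for the elements found in these residues.
ATOM_MASS = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}

# Elemental compositions of glycosidically linked residues (free sugar minus H2O).
# Hypothetical lookup table for illustration only.
RESIDUES = {
    "Hex":   Counter(C=6, H=10, O=5),
    "Fuc":   Counter(C=6, H=10, O=4),
    "NeuAc": Counter(C=11, H=17, N=1, O=8),
    "NeuGc": Counter(C=11, H=17, N=1, O=9),
}

def composition(residues):
    """Sum the elemental compositions of a list of residue names."""
    total = Counter()
    for name in residues:
        total += RESIDUES[name]
    return total

def monoisotopic_mass(comp):
    """Monoisotopic mass (Da) of an elemental composition."""
    return sum(ATOM_MASS[element] * count for element, count in comp.items())

pair_a = composition(["NeuGc", "Fuc"])
pair_b = composition(["NeuAc", "Hex"])
mass_a = monoisotopic_mass(pair_a)
mass_b = monoisotopic_mass(pair_b)

print("same composition:", pair_a == pair_b)
print("mass difference (Da):", abs(mass_a - mass_b))
```

Both pairs sum to C17H27NO13 (about 453.148 Da), so neither higher mass accuracy nor higher resolution at the precursor level would resolve them. Any distinction has to come from the fragment ions.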
