Protein glycosylation is a highly heterogeneous post-translational modification that has been shown to vary significantly in various diseases. Because glycosylation patterns differ between diseased and healthy populations, glycosylated proteins hold promise as early indicators of multiple diseases. With the continuous development of liquid chromatography-mass spectrometry (LC-MS) technology and spectrum analysis software, the sensitivity of interpreting tandem mass spectra of intact glycopeptides, i.e., glycopeptides that carry intact glycans and are released from glycoproteins by enzymatic hydrolysis, has improved significantly. From quantified intact glycopeptides, differences in protein glycosylation between samples can be obtained at multiple levels, e.g., glycoprotein, glycan, glycosite, and site-specific glycan. However, manual analysis of quantitative intact glycopeptide data at multiple levels is tedious and time-consuming. In this study, we developed a software tool named "GP-Marker" to facilitate large-scale, multi-level data mining of intact N-glycopeptide spectral datasets. The software provides a user-friendly, interactive interface that makes machine learning tools accessible to researchers without a programming background. It includes a range of visualization plots that display differential glycosylation, and it supports multi-level analysis of intact glycopeptide data quantified by Glyco-Decipher.
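To make the multi-level idea concrete, the following minimal sketch rolls quantified intact glycopeptide records up to the glycoprotein, glycosite, glycan, and site-specific glycan levels. It is an illustration only: the record fields, the sample labels, the intensity values, and the function names `aggregate` and `summarize` are all hypothetical, and it reflects neither the Glyco-Decipher output format nor the GP-Marker implementation. Summing intensities is likewise just one assumed way to aggregate.

```python
from collections import defaultdict

# Hypothetical quantified intact N-glycopeptide records.
# Values are made up for illustration; this is not the Glyco-Decipher format.
records = [
    {"protein": "P1", "site": 45, "glycan": "HexNAc(2)Hex(5)", "sample": "disease", "intensity": 120.0},
    {"protein": "P1", "site": 45, "glycan": "HexNAc(2)Hex(5)", "sample": "healthy", "intensity": 80.0},
    {"protein": "P1", "site": 45, "glycan": "HexNAc(4)Hex(5)NeuAc(2)", "sample": "disease", "intensity": 30.0},
    {"protein": "P1", "site": 88, "glycan": "HexNAc(2)Hex(5)", "sample": "healthy", "intensity": 50.0},
    {"protein": "P2", "site": 12, "glycan": "HexNAc(4)Hex(5)NeuAc(2)", "sample": "disease", "intensity": 200.0},
]

# Each analysis level is defined by the key that records are grouped on.
LEVELS = {
    "glycoprotein": lambda r: (r["protein"],),
    "glycosite": lambda r: (r["protein"], r["site"]),
    "glycan": lambda r: (r["glycan"],),
    "site_specific_glycan": lambda r: (r["protein"], r["site"], r["glycan"]),
}


def aggregate(records, key_fn):
    """Sum intensities per group key and per sample label."""
    totals = defaultdict(lambda: defaultdict(float))
    for r in records:
        totals[key_fn(r)][r["sample"]] += r["intensity"]
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(per_sample) for key, per_sample in totals.items()}


def summarize(records):
    """Aggregate the same records at every level in LEVELS."""
    return {level: aggregate(records, key_fn) for level, key_fn in LEVELS.items()}


summary = summarize(records)
for level, groups in summary.items():
    print(level, groups)
```

The same records feed every level; only the grouping key changes, which is why a single table of quantified intact glycopeptides is enough to compare samples at all four levels.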