Abstract
We study O-glycosylation of mammalian proteins. It occurs specifically at serine or threonine residues, but no consensus sequence has yet been identified. We have applied support vector machines (SVMs) to predict O-glycosylation sites from various kinds of protein information, aiming to investigate the conditions for glycosylation and elucidate its mechanisms. In the present study, we focus on the distribution of the glycosylation sites. Many O-glycosylated sites occur in clusters of closely spaced glycosylated sites, whereas the others are sparse or isolated. These two types of sites, crowded and isolated, may have different glycosylation mechanisms. We divide all O-glycosylation sites into a crowded group and an isolated group, and train a separate SVM on each group to predict its O-glycosylation sites. The prediction results for the two groups depend differently on the input information. They indicate that some motifs are expected for the isolated group, whereas for the crowded group glycosylation is affected by the interaction between glycosylated sites and by the relative proportion of the surrounding amino acids. We also compare the statistics of the amino acid sequences around the glycosylation sites of both groups. As a result, some amino acids (proline, valine, alanine, etc.) occur with high probability at specific positions relative to a glycosylation site, especially for isolated sites. Moreover, independent component analysis (ICA) of the amino acid sequences reveals position-specific occurrences of these amino acids, including the well-known prolines at positions -1 and +3, which appear as different independent components.
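The division into crowded and isolated groups could be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the criterion used, so the function name `split_sites` and the distance threshold `max_gap` (here 10 residues) are hypothetical, and a site is treated as crowded whenever another glycosylated site lies within that distance.

```python
from typing import Dict, List


def split_sites(positions: List[int], max_gap: int = 10) -> Dict[str, List[int]]:
    """Split glycosylated residue positions into crowded and isolated groups.

    Assumption (not from the source): a site counts as "crowded" when at
    least one other glycosylated site lies within `max_gap` residues of it;
    otherwise it is "isolated". The default of 10 is a placeholder.
    """
    sites = sorted(set(positions))
    groups: Dict[str, List[int]] = {"crowded": [], "isolated": []}
    for i, pos in enumerate(sites):
        # Only the nearest neighbours on each side need to be checked,
        # because the positions are sorted.
        near_prev = i > 0 and pos - sites[i - 1] <= max_gap
        near_next = i + 1 < len(sites) and sites[i + 1] - pos <= max_gap
        label = "crowded" if (near_prev or near_next) else "isolated"
        groups[label].append(pos)
    return groups


if __name__ == "__main__":
    # Made-up positions, for illustration only.
    print(split_sites([5, 8, 12, 60, 130, 133]))
```

Each group would then be passed to its own classifier; the threshold would need to be chosen from the data rather than taken from this sketch.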