Abstract

O-glycosylation is one of the main types of mammalian protein glycosylation. It is specific to serine and threonine residues, but no consensus sequence has yet been identified. In this paper, a layered neural network and a support vector machine (SVM) are used to predict O-glycosylation sites. Three encodings of the protein sequence within a fixed-size window are used as the input to the network: a sparse coding that distinguishes all 20 amino acid residues, a 5-letter coding, and a hydropathy coding. In the neural network, a single output unit predicts whether a particular serine or threonine site is glycosylated, while the SVM classifies each site into one of two classes. Performance is evaluated by the Matthews correlation coefficient. Preliminary results with the neural network show that the sparse and 5-letter codings perform better than the hydropathy coding, while the SVM results show that the improvement gained by enlarging the window is limited.
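
As a rough illustration of two ingredients named in the abstract, the sketch below builds a sparse (one-hot) encoding of a fixed-size window around a candidate site and computes the Matthews correlation coefficient from confusion-matrix counts. It is not the authors' implementation. The function names, the `-` padding character for windows that run past the sequence ends, the all-zero encoding of padding, and the choice to return 0 when the coefficient's denominator vanishes are all assumptions made for this example; the abstract does not specify them.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def extract_window(sequence, site, half_width):
    """Return the 2*half_width+1 residues centred on `site`.

    Positions outside the sequence are filled with '-' (an assumption;
    the abstract does not say how sequence ends are handled).
    """
    return "".join(
        sequence[i] if 0 <= i < len(sequence) else "-"
        for i in range(site - half_width, site + half_width + 1)
    )


def sparse_encode(window):
    """One-hot encode each window position as 20 values.

    Padding ('-') matches no residue and so becomes 20 zeros (assumption).
    """
    vector = []
    for residue in window:
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    seq = "MKTSAPT"                      # toy sequence, not real data
    window = extract_window(seq, 3, 2)   # window around the serine at index 3
    features = sparse_encode(window)
    print(window, len(features))
    print(matthews_cc(tp=5, tn=5, fp=0, fn=0))
```

With a window of 2w+1 positions, the sparse coding yields 20(2w+1) input values, so the input dimension grows linearly with the window size. A coding with fewer symbols per position, such as the 5-letter coding, would shrink this proportionally.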
