Abstract

Local protein structure prediction is a central task in bioinformatics research, and for huge datasets it can be cast as a multiclass classification problem. In a previous study, multiclass clustering support vector machines (CSVMs) were proposed for local protein structure prediction. In that approach, a greedy algorithm selects the next closest class whenever the CSVM modeled for the assigned class predicts the sequence segment as negative. However, the greedy algorithm may not be optimal, and if all CSVMs predict the sequence segment as negative, the segment cannot be classified. To further improve performance on the multiclass problem, we propose fuzzy clustering support vector machines (FCSVMs) in this study. The FCSVMs model calculates the membership value of a given sequence segment for each class and assigns the representative structure of the finally selected class to the segment. Values of the fuzzy membership function are based on the testing accuracy of the decision function outputs from the FCSVMs, so membership values from different classes can be compared directly. Each FCSVM is built specifically for one class, and the classes are obtained by partitioning the data with a clustering algorithm. This makes the learning task for each FCSVM more specific and simpler. Furthermore, the FCSVMs modeled for the individual classes can easily be run in parallel to handle complex multiclass problems on huge datasets. Because fuzzy membership functions are used, all sequence segments can be classified. Compared with the conventional clustering algorithm and CSVMs, testing accuracy for local structure prediction is noticeably improved when the FCSVMs model is applied.
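The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how testing accuracy is turned into a membership value, so the sketch assumes a simple lookup in which each class has its decision-value range split into bins and each bin stores the fraction of held-out positives observed in it. The names `make_membership` and `classify`, the bin edges, and all numbers are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def make_membership(bin_edges, bin_accuracy):
    """Build a membership function for one class (assumed binned lookup).

    bin_edges    -- sorted decision-value thresholds, length k
    bin_accuracy -- held-out accuracy for each of the k + 1 bins
    """
    if len(bin_accuracy) != len(bin_edges) + 1:
        raise ValueError("need one accuracy value per bin")

    def membership(decision_value):
        # Locate the bin containing this decision value and
        # return the accuracy recorded for that bin.
        return bin_accuracy[bisect_right(bin_edges, decision_value)]

    return membership


def classify(decision_values, memberships):
    """Pick the class with the highest membership value.

    decision_values -- {class_label: decision value of that class's SVM}
    memberships     -- {class_label: membership function for that class}
    """
    scores = {c: memberships[c](v) for c, v in decision_values.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical per-class accuracy tables (illustrative numbers only).
edges = [-1.0, 0.0, 1.0]
memberships = {
    "A": make_membership(edges, [0.05, 0.30, 0.70, 0.95]),
    "B": make_membership(edges, [0.10, 0.40, 0.60, 0.90]),
    "C": make_membership(edges, [0.02, 0.25, 0.65, 0.97]),
}

# Every per-class SVM votes "negative" (decision value below 0),
# yet the segment still receives a class label.
best, scores = classify({"A": -0.2, "B": -0.6, "C": -1.5}, memberships)
```

In this example every decision value is negative, which is the case a hard-vote scheme would leave unclassified. Because the membership values share a common accuracy scale, the classes remain comparable and one of them is still selected.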
