Abstract

The cellular localization site of a protein is closely related to its potential function. In this paper, we develop a novel Double-SVM Classification System for predicting the subcellular localization sites of proteins. First, a set of features is constructed from the occurrence frequencies of sequence motifs. Discriminant features are then selected by I-RELIEF and used as inputs to a support vector machine (SVM) for classification. The two classes are single and multiple subcellular localizations. Because protein sequences differ greatly in length, we train two SVMs, one for the shorter sequences and the other for the longer ones. The system is applied to predict the subcellular localization sites of yeast proteins. Experimental results show that the testing accuracy of the system is 66%, which is higher than that of the traditional single-SVM model.
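The length-based routing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system from the paper: the motif list, the length threshold, and the names `motif_frequencies` and `LengthRoutedClassifier` are hypothetical. Motifs are treated as plain substrings, the I-RELIEF selection step is assumed to have already produced the motif list, and the classifier is left pluggable (any object with `fit` and `predict`).

```python
from typing import Callable, List, Sequence


def motif_frequencies(sequence: str, motifs: Sequence[str]) -> List[float]:
    """Occurrence frequency of each motif, counting overlapping matches.

    Assumption: motifs are exact substrings; the paper's motif definition
    may differ.
    """
    n = max(len(sequence), 1)  # avoid division by zero on empty input
    features = []
    for motif in motifs:
        k = len(motif)
        # Slide a window of the motif's length across the sequence.
        hits = sum(
            1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] == motif
        )
        # Normalise by sequence length so proteins of different sizes compare.
        features.append(hits / n)
    return features


class LengthRoutedClassifier:
    """Two independent models: one for shorter and one for longer sequences."""

    def __init__(self, make_model: Callable[[], object],
                 motifs: Sequence[str], length_threshold: int):
        self.motifs = list(motifs)                # assumed pre-selected features
        self.length_threshold = length_threshold  # hypothetical cut-off
        self.models = {"short": make_model(), "long": make_model()}

    def _bucket(self, sequence: str) -> str:
        return "short" if len(sequence) < self.length_threshold else "long"

    def fit(self, sequences: Sequence[str], labels: Sequence[str]):
        for bucket, model in self.models.items():
            # Each model only ever sees sequences from its own length bucket.
            rows = [
                (motif_frequencies(s, self.motifs), y)
                for s, y in zip(sequences, labels)
                if self._bucket(s) == bucket
            ]
            if not rows:
                raise ValueError(f"no training sequences in the {bucket!r} bucket")
            X, y = zip(*rows)
            model.fit(list(X), list(y))
        return self

    def predict(self, sequences: Sequence[str]) -> list:
        # Route every query to the model trained on its length bucket.
        return [
            self.models[self._bucket(s)].predict(
                [motif_frequencies(s, self.motifs)]
            )[0]
            for s in sequences
        ]
```

Passing a factory rather than a fitted model keeps the two classifiers independent. If scikit-learn is available, its `SVC` class could be supplied as `make_model` to obtain two SVMs, with labels such as `"single"` and `"multiple"`.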
